
r - How to group_by a variable and cut time into 10-second bins starting at exactly 13:24:00, then average within each group

Reposted. Author: 行者123. Updated: 2023-12-03 18:23:34

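I have CO2 measurements from 30 sensors that do not all measure at the same moments and did not start at exactly the same time. I want to align them as closely as possible, so I thought taking 10-second averages might be a good solution.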

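In a previous question, "Group by multiple variables and summarise dplyr", I cut each sensor's time into 10-second bins and averaged each sensor's readings within those bins. That sounded fine, but I then realized that the code below starts the bins at whatever time each sensor happens to start, so the sensors are still not aligned. How can I align them?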

require(tidyverse)
require(lubridate)
df %>%
group_by(Sensor, BinnedTime = cut(DeviceTime, breaks="10 sec")) %>%
mutate(Concentration = mean(calCO2)) %>%
ungroup()

head(df)
# A tibble: 6 x 7
# Groups: BinnedTime [1]

Sensor Date Time calCO2 DeviceTime cuts BinnedTime
<fctr> <date> <time> <dbl> <dttm> <fctr> <chr>
1 N1 2019-02-12 13:24 400 2019-02-12 13:24:02 (0,10] 2019-02-12 13:24:02
2 N1 2019-02-12 13:24 400 2019-02-12 13:24:02 (0,10] 2019-02-12 13:24:02
3 N1 2019-02-12 13:24 400 2019-02-12 13:24:03 (0,10] 2019-02-12 13:24:03
4 N2 2019-02-12 13:24 400 2019-02-12 13:24:03 (0,10] 2019-02-12 13:24:02
5 N3 2019-02-12 13:24 400 2019-02-12 13:24:03 (0,10] 2019-02-12 13:24:02
6 N3 2019-02-12 13:24 400 2019-02-12 13:24:05 (0,10] 2019-02-12 13:24:04

Edit

I tried:
dt<-seq(
from=as.POSIXct("2019-02-12 13:24:00", tz="GMT"),
to=as.POSIXct("2019-02-12 14:00:00", tz="GMT"),
by="10 sec"
)

cut(df$BinnedTime,dt)

But it gave an error saying that x must be numeric, so I converted df$BinnedTime and dt to numeric, which only produces NAs.
cut(as.numeric(as.POSIXct(df$BinnedTime)), as.numeric(dt))

What am I missing?
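For reference, the "x must be numeric" error is what base R's default `cut()` method raises for non-numeric input, and the printout above shows `BinnedTime` as a `<chr>` column. Applied to a POSIXct vector, `cut()` accepts POSIXct breaks directly, and any value outside the range of the breaks comes back as `NA`. The following is a minimal sketch with made-up timestamps (`times` and `breaks` are hypothetical stand-ins, not the real data):

```r
# Hypothetical timestamps standing in for df$DeviceTime (not the real data)
times <- as.POSIXct(
  c("2019-02-12 13:24:02", "2019-02-12 13:24:09",
    "2019-02-12 13:24:13", "2019-02-12 13:26:00"),
  tz = "GMT"
)

# Fixed 10-second breaks starting at exactly 13:24:00, in the same time zone
breaks <- seq(
  from = as.POSIXct("2019-02-12 13:24:00", tz = "GMT"),
  to   = as.POSIXct("2019-02-12 13:25:00", tz = "GMT"),
  by   = "10 sec"
)

# cut() on a POSIXct vector accepts POSIXct breaks; bins are [start, end)
bins <- cut(times, breaks = breaks)

# Bin index per timestamp; the last value lies beyond the final break, so it is NA
bin_index <- as.integer(bins)
print(bin_index)
```

If the data and the breaks are built in different time zones, every value can land outside the breaks in the same way, which is one possible source of all-`NA` results.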

Edit 2

I now have the following:
df$DeviceTime <- as.POSIXct(paste(df$Date, df$Time), format="%Y-%m-%d %H:%M:%S")

df<-df%>%
mutate(BinnedTime=floor_date(ymd_hms(DeviceTime),unit="10 sec"))%>%
group_by(Sensor)%>%
group_by(BinnedTime,add=TRUE)%>%
summarize(calCO2 = mean(na.omit(calCO2)))

I think this is what I was after, but it is not elegant.

Here is the data file on OneDrive: df.txt until 30th March 19

Best answer

library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date

df <- read_delim("https://gist.githubusercontent.com/ramiromagno/4347eefec2aa36ec94423b75b145fccb/raw/5c1b445686bd014ea3a1f0336433e3b364779766/df.txt", delim = " ", col_types = cols())

df$DeviceTime <- as.POSIXct(paste(df$Date, df$Time), format="%Y-%m-%d %H:%M:%S")

dt <- seq(
from = as.POSIXct("2019-02-12 13:24:00", tz = "GMT"),
to = as.POSIXct("2019-02-12 14:00:00", tz = "GMT"),
by = "10 sec"
)

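# Assign each timestamp to one of the fixed 10-second breaks,
# then average calCO2 per sensor and bin (ignoring missing values)
df2 <- df %>%
  mutate(BinnedTime = cut(DeviceTime, breaks = dt)) %>%
  group_by(Sensor, BinnedTime) %>%
  summarize(calCO2 = mean(calCO2, na.rm = TRUE))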

df2
#> # A tibble: 7,557 x 3
#> # Groups: Sensor [?]
#> Sensor BinnedTime calCO2
#> <chr> <fct> <dbl>
#> 1 A1 2019-02-12 13:24:00 400
#> 2 A1 2019-02-12 13:24:10 401
#> 3 A1 2019-02-12 13:24:20 401
#> 4 A1 2019-02-12 13:24:30 401
#> 5 A1 2019-02-12 13:24:40 401
#> 6 A1 2019-02-12 13:24:50 400
#> 7 A1 2019-02-12 13:25:00 400
#> 8 A1 2019-02-12 13:25:10 398
#> 9 A1 2019-02-12 13:25:20 397
#> 10 A1 2019-02-12 13:25:30 394
#> # ... with 7,547 more rows
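
As a possible alternative to `cut()`, lubridate's `floor_date()` rounds each timestamp down to a clock-aligned 10-second boundary, so all sensors share the same bins without building a break sequence, and `BinnedTime` stays a date-time instead of a factor. This is only a sketch: the `readings` tibble below is made-up illustration data, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(lubridate)

# Made-up readings for two sensors (hypothetical values, not the real data)
readings <- tibble(
  Sensor = c("A1", "A1", "A1", "B2", "B2"),
  DeviceTime = ymd_hms(
    c("2019-02-12 13:24:02", "2019-02-12 13:24:07", "2019-02-12 13:24:13",
      "2019-02-12 13:24:05", "2019-02-12 13:24:11"),
    tz = "GMT"
  ),
  calCO2 = c(400, 402, 398, 410, 405)
)

binned <- readings %>%
  # Round every timestamp down to the start of its 10-second window
  mutate(BinnedTime = floor_date(DeviceTime, unit = "10 seconds")) %>%
  # One group per sensor and window, shared across sensors
  group_by(Sensor, BinnedTime) %>%
  # Average the readings in each window, ignoring missing values
  summarise(calCO2 = mean(calCO2, na.rm = TRUE), .groups = "drop")

print(binned)
```

Because the windows are aligned to the clock rather than to each sensor's first reading, the bins for A1 and B2 both start at 13:24:00 and 13:24:10 here.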

Regarding "r - How to group_by a variable and cut time into 10-second bins starting at exactly 13:24:00, then average within each group", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54884478/
