r - How to process multi-day data in R

Tags: r

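I have data frames of the same kind collected over ten days, and I want to use those ten days of data to analyze the general pattern.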

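For example: first, I need to split the data frame into groups by time interval (e.g. 10 seconds). Second, compute the percentage of the value "1" in each group, separately for column C and column D. Finally, plot how the percentages for column C and column B change over time.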

          time                B   C D
1  2014-08-04 00:00:04.0       red 0 0
2  2014-08-04 00:00:06.0       red 0 0
3  2014-08-04 00:00:06.0       red 1 0
4  2014-08-04 00:00:06.2       red 0 0
5  2014-08-04 00:00:06.5       red 0 0
6  2014-08-04 00:00:07.0       red 0 1
7  2014-08-04 00:00:07.7       red 0 0
8  2014-08-04 00:00:16.0       red 0 0
9  2014-08-04 00:00:17.0       red 1 0
10 2014-08-04 00:00:18.0       red 0 0
11 2014-08-04 00:00:22.0       red 0 0
12 2014-08-04 00:00:22.0       red 0 0
13 2014-08-04 00:00:22.2       red 0 0
14 2014-08-04 00:00:25.0       red 1 0
15 2014-08-04 00:00:27.0       red 1 0
16 2014-08-04 00:00:28.0       red 0 0
17 2014-08-04 00:00:29.0 red/amber 1 0
18 2014-08-04 00:00:29.0 red/amber 1 1
19 2014-08-04 00:00:30.0     green 0 0
20 2014-08-04 00:00:40.0     green 0 1
21 2014-08-04 00:00:42.4     green 0 0
22 2014-08-04 00:00:43.0     green 0 0
23 2014-08-04 00:00:50.0       red 1 0
24 2014-08-04 00:00:51.2       red 0 0
25 2014-08-04 00:00:52.0       red 0 1
26 2014-08-04 00:00:52.0       red 1 0
27 2014-08-04 00:00:52.2       red 1 0
28 2014-08-04 00:00:52.9       red 1 1
29 2014-08-04 00:00:53.0       red 0 0
30 2014-08-04 00:00:59.0       red 0 1
31 2014-08-04 00:01:02.0       red 0 1
32 2014-08-04 00:01:03.2       red 0 1
33 2014-08-04 00:01:04.0       red 1 1
34 2014-08-04 00:01:06.4       red 0 1
35 2014-08-04 00:01:07.5       red 1 1
36 2014-08-04 00:01:08.0       red 0 1
37 2014-08-04 00:01:08.2       red 0 1
38 2014-08-04 00:01:08.4       red 0 1
39 2014-08-04 00:01:11.0       red 0 1
40 2014-08-04 00:01:13.0       red 0 1
41 2014-08-04 00:01:14.0       red 0 1
42 2014-08-04 00:01:15.0 red/amber 0 1
43 2014-08-04 00:01:15.0 red/amber 0 1
44 2014-08-04 00:01:16.0     green 0 1
45 2014-08-04 00:01:21.0     green 0 0
46 2014-08-04 00:01:26.0     green 0 0
47 2014-08-04 00:01:31.0     amber 0 0
48 2014-08-04 00:01:31.0     amber 0 0
49 2014-08-04 00:01:34.0       red 0 0
50 2014-08-04 00:01:36.0       red 0 0

Data for August 11:

           time           B     C    D
1  2014-08-11 00:00:02.0 red    0    0
2  2014-08-11 00:00:03.0 red    0    0
3  2014-08-11 00:00:04.0 red    0    0
4  2014-08-11 00:00:07.0 red    0    0
5  2014-08-11 00:00:08.0 red    0    0
6  2014-08-11 00:00:08.0 red    0    0
7  2014-08-11 00:00:08.2 red    0    0
8  2014-08-11 00:00:08.5 red    0    0
9  2014-08-11 00:00:08.9 red    0    0
10 2014-08-11 00:00:09.0 red    0    0
11 2014-08-11 00:00:09.5 red    0    0
12 2014-08-11 00:00:10.0 red    0    0
13 2014-08-11 00:00:10.2 red    0    0
14 2014-08-11 00:00:10.4 red    0    0
15 2014-08-11 00:00:10.5 red    0    0
16 2014-08-11 00:00:10.7 red    0    0
17 2014-08-11 00:00:11.7 red    0    0
18 2014-08-11 00:00:11.9 red    0    0
19 2014-08-11 00:00:12.0 red    0    0
20 2014-08-11 00:00:12.0 red    0    0
21 2014-08-11 00:00:12.2 red    0    0
22 2014-08-11 00:00:12.2 red    0    0
23 2014-08-11 00:00:12.5 red    0    0
24 2014-08-11 00:00:12.7 red    0    0
25 2014-08-11 00:00:13.0 red    0    0
26 2014-08-11 00:00:13.2 red    0    0
27 2014-08-11 00:00:13.2 red    0    0
28 2014-08-11 00:00:13.5 red    0    0
29 2014-08-11 00:00:13.7 red    0    0
30 2014-08-11 00:00:13.9 red    0    0
31 2014-08-11 00:00:14.2 red    0    0
32 2014-08-11 00:00:14.4 red    0    0
33 2014-08-11 00:00:14.7 red    0    0
34 2014-08-11 00:00:14.7 red    0    0
35 2014-08-11 00:00:15.0 red    0    0
36 2014-08-11 00:00:15.0 red    0    0
37 2014-08-11 00:00:15.2 red    0    0
38 2014-08-11 00:00:16.5 red    0    1
39 2014-08-11 00:00:17.0 red    0    1
40 2014-08-11 00:00:17.0 red    0    1
41 2014-08-11 00:00:17.9 red    0    1
42 2014-08-11 00:00:18.0 red    0    1
43 2014-08-11 00:00:18.0 red    0    1
44 2014-08-11 00:00:18.2 red    0    1
45 2014-08-11 00:00:18.4 red    0    1
46 2014-08-11 00:00:18.5 red    0    1
47 2014-08-11 00:00:18.7 red    0    1
48 2014-08-11 00:00:19.0 red    0    1
49 2014-08-11 00:00:19.2 red    0    1
50 2014-08-11 00:00:19.7 red    0    1

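I only know how to process a single day of data (the code is at the end of this question). But how do I plot the data for all ten days? The x axis should show only the time of day, without the date, so that the plot gives the overall result across these days. In other words, the data from all days should be combined to get an averaged result.

This is just one example. Whenever I need to process several days of data to get an averaged, general result, I end up doing a lot of awkward work. Thanks for the help. T^T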

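library(reshape2)
library(ggplot2)
# Label each timestamp with the start of its 10-second interval
df$time <- as.POSIXct(cut(as.POSIXct(df$time), "10 secs"))
# Long format: one row per (time, B, variable), where variable is C or D
df.mlt <- melt(df, id.var=c("time", "B"))
ggplot(df.mlt, aes(x=time, y=value, color=variable)) +
  # ggplot2 >= 3.3.0 spells this argument fun=; older versions use fun.y=
  stat_summary(geom="point", fun=mean, shape=1) +
  stat_smooth()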

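Best answer

For the first two parts, you could try the following. (It splits the data into 10-second intervals; it was not clear whether the day should be included.)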

library(data.table)
df$time1 <- as.POSIXct(cut(as.POSIXct(df$time, format= "%Y-%m-%d %H:%M:%S"), "10 secs"))
df1 <- df[,-1] #deleted the time column
dt <- data.table(df1, key='time1')
dt1 <-  dt[, list(C1=round(100*(sum(C==1)/.N),2), D1=round(100*(sum(D==1)/.N),2)), by=time1]
dt1
 #                 time1    C1     D1
 #1: 2014-08-04 00:00:04 14.29  14.29
 #2: 2014-08-04 00:00:14 16.67   0.00
 #3: 2014-08-04 00:00:24 66.67  16.67
 #4: 2014-08-04 00:00:34  0.00  33.33
 #5: 2014-08-04 00:00:44 57.14  28.57
 #6: 2014-08-04 00:00:54  0.00 100.00
 #7: 2014-08-04 00:01:04 25.00 100.00
 #8: 2014-08-04 00:01:14  0.00  80.00
 #9: 2014-08-04 00:01:24  0.00   0.00
#10: 2014-08-04 00:01:34  0.00   0.00
#11: 2014-08-10 23:59:54  0.00   0.00
#12: 2014-08-11 00:00:04  0.00   0.00
#13: 2014-08-11 00:00:14  0.00  65.00
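
Judging from the output above, cut() starts its 10-second bins at the earliest timestamp in the data (00:00:04 here) rather than on a round clock boundary. If you would rather have bins that are aligned to the clock and do not depend on the date, one option is to work with seconds since midnight. This is only a sketch in base R: `time_of_day_bin` and its `width` argument are hypothetical names that are not part of the code above, and the timestamps are assumed to be in UTC.

```r
# Hypothetical helper (not from the answer above): map each timestamp to the
# start of its time-of-day bin, ignoring the date entirely.
time_of_day_bin <- function(time, width = 10) {
  lt    <- as.POSIXlt(time, tz = "UTC")
  secs  <- lt$hour * 3600 + lt$min * 60 + floor(lt$sec)  # seconds since midnight
  start <- as.integer((secs %/% width) * width)          # bin start, in seconds
  sprintf("%02d:%02d:%02d",
          start %/% 3600L, (start %% 3600L) %/% 60L, start %% 60L)
}

# Four timestamps taken from the two sample days in the question
x <- as.POSIXct(c("2014-08-04 00:00:04", "2014-08-04 00:00:16",
                  "2014-08-11 00:00:02", "2014-08-11 00:00:19.7"),
                tz = "UTC")
bins <- time_of_day_bin(x)
print(bins)
```

Rows from different days that fall in the same 10-second window of the day get the same label, so the label can be used directly as a grouping column.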

Update

 dt1[, list(C1=mean(C1), D1= mean(D1)), by=list(timeN=gsub("^.*\\s+","", time1))]
 #      timeN     C1      D1
 #1: 00:00:04  7.145   7.145
 #2: 00:00:14  8.335  32.500
 #3: 00:00:24 66.670  16.670
 #4: 00:00:34  0.000  33.330
 #5: 00:00:44 57.140  28.570
 #6: 00:00:54  0.000 100.000
 #7: 00:01:04 25.000 100.000
 #8: 00:01:14  0.000  80.000
 #9: 00:01:24  0.000   0.000
#10: 00:01:34  0.000   0.000
#11: 23:59:54  0.000   0.000

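Update 2

I think this is what you need. The values come out differently. In the previous example, the result was just the mean of the per-day percentages. Here, I compute the percentage within each cut time interval over all the days together. This is probably more correct.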

 df1$timeN <- gsub("^.*\\s+", "", df1$time1)
 dt <- data.table(df1, key='timeN')
 dt1 <- dt[,list(C1=round(100*(sum(C==1)/.N),2), D1=round(100*(sum(D==1)/.N),2)), by=timeN]

 dt1
 #      timeN    C1     D1
 #1: 00:00:04 14.29  14.29
 #2: 00:00:14 16.67   0.00
 #3: 00:00:24 66.67  16.67
 #4: 00:00:34  0.00  33.33
 #5: 00:00:44 57.14  28.57
 #6: 00:00:54  0.00 100.00
 #7: 00:01:04 25.00 100.00
 #8: 00:01:14  0.00  80.00
 #9: 00:01:24  0.00   0.00
#10: 00:01:34  0.00   0.00
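
To see why the two updates can give different numbers, here is a small worked example in base R. The `toy` data frame is made up for illustration and is not taken from the question: it represents a single time-of-day bin that has 2 rows on one day and 8 rows on the other.

```r
# Toy data (made up for illustration): one time-of-day bin seen on two days
toy <- data.frame(
  day = c(rep("2014-08-04", 2), rep("2014-08-11", 8)),
  C   = c(1, 0, rep(0, 8))
)

# "Update" approach: percentage of 1s per day, then the mean of the percentages
per_day <- tapply(toy$C == 1, toy$day, mean) * 100   # 50 and 0
mean_of_percentages <- mean(per_day)                 # (50 + 0) / 2

# "Update 2" approach: pool the rows from all days, then take one percentage
pooled_percentage <- 100 * sum(toy$C == 1) / nrow(toy)   # 1 row out of 10

print(c(mean_of_percentages = mean_of_percentages,
        pooled_percentage   = pooled_percentage))
```

The mean of percentages gives every day the same weight, while the pooled percentage gives every row the same weight, so the two only agree when each day contributes the same number of rows to a bin.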

For "r - How to process multi-day data in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25381351/
