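I have a data frame organized by timestamp and ID. For each ID and each minute, I have 8 columns of data, and each column holds one of four event-intensity predictions: Sedentary, Light, Moderate, or Vigorous. The data is laid out in the following format.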
id time x1 x2 x3
1 10:30 Moderate Light Light
1 10:31 Moderate Light Moderate
...
2 12:24 Light Light Light
2 12:25 Light Light Light
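For each ID, I want the total count of each event intensity in every predictor column (x1, x2, x3, and so on). Using the example above, I would like to reshape my data so that it looks like this: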
id Intensity x1 x2 x3
1 Light 0 2 1
1 Moderate 2 0 1
...
2 Light 2 2 2
2 Moderate 0 0 0
In case it matters, my file has about 80 IDs and 8 event-intensity prediction columns (x1-x8).
Best answer
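library(tidyverse)

df %>%
  select(-time) %>%                  # timestamps are not needed for the counts
  gather(key, intensity, -id) %>%    # long format: one row per id, predictor, and intensity
  group_by(id, intensity, key) %>%
  tally() %>%                        # n = number of rows (minutes) per combination
  spread(key, n) %>%                 # back to wide format: one column per predictor
  replace(is.na(.), 0)               # combinations that never occur become 0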
The output is:
id intensity x1 x2 x3
1 1 Light 0 2 1
2 1 Moderate 3 0 2
3 1 Sedentary 1 0 1
4 1 Vigorous 0 2 0
5 2 Light 2 0 2
6 2 Moderate 1 1 0
7 2 Sedentary 0 2 0
8 2 Vigorous 0 0 1
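As a side note, gather() and spread() are marked as superseded in current tidyr releases, so the same counts can also be sketched with pivot_longer() and pivot_wider(). This is only an illustration, not the answer's original code: it assumes tidyr 1.1.0 or later (for the scalar values_fill and for names_sort), and the name counts is made up for this example.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the answer, rebuilt here so the snippet runs on its own
df <- data.frame(
  id   = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  time = c("10:30", "10:31", "10:32", "10:33", "12:24", "12:25", "12:26"),
  x1   = c("Moderate", "Moderate", "Sedentary", "Moderate", "Light", "Moderate", "Light"),
  x2   = c("Light", "Light", "Vigorous", "Vigorous", "Moderate", "Sedentary", "Sedentary"),
  x3   = c("Light", "Moderate", "Moderate", "Sedentary", "Light", "Light", "Vigorous"),
  stringsAsFactors = FALSE
)

counts <- df %>%
  select(-time) %>%
  # long format: one row per id, predictor column, and predicted intensity
  pivot_longer(-id, names_to = "key", values_to = "intensity") %>%
  # count() replaces the group_by() + tally() pair
  count(id, intensity, key) %>%
  # names_sort = TRUE keeps the columns in x1, x2, x3 order;
  # values_fill = 0L fills combinations that never occur
  pivot_wider(names_from = key, values_from = n,
              values_fill = 0L, names_sort = TRUE)

print(counts)
```

Without names_sort = TRUE the predictor columns would appear in order of first occurrence in the counted data rather than alphabetically, which is why it is set here.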
Sample data:
df <- structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L), time = c("10:30",
"10:31", "10:32", "10:33", "12:24", "12:25", "12:26"), x1 = c("Moderate",
"Moderate", "Sedentary", "Moderate", "Light", "Moderate", "Light"
), x2 = c("Light", "Light", "Vigorous", "Vigorous", "Moderate",
"Sedentary", "Sedentary"), x3 = c("Light", "Moderate", "Moderate",
"Sedentary", "Light", "Light", "Vigorous")), class = "data.frame", row.names = c(NA,
-7L))
# id time x1 x2 x3
#1 1 10:30 Moderate Light Light
#2 1 10:31 Moderate Light Moderate
#3 1 10:32 Sedentary Vigorous Moderate
#4 1 10:33 Moderate Vigorous Sedentary
#5 2 12:24 Light Moderate Light
#6 2 12:25 Moderate Sedentary Light
#7 2 12:26 Light Sedentary Vigorous
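One detail worth noting: the desired output in the question includes a row of zeros (id 2, Moderate), but a count-then-reshape pipeline only produces rows for intensities that occur at least once for a given ID. In the sample data above every intensity occurs for both IDs, so the difference does not show. The sketch below is one possible way to keep all four levels, using tidyr::complete() on the four rows visible in the question (the elided rows are left out). The names df_q, levels_all, and full_counts are hypothetical, and tidyr 1.1.0 or later is assumed.

```r
library(dplyr)
library(tidyr)

# The four rows visible in the question's example (elided rows omitted)
df_q <- data.frame(
  id   = c(1L, 1L, 2L, 2L),
  time = c("10:30", "10:31", "12:24", "12:25"),
  x1   = c("Moderate", "Moderate", "Light", "Light"),
  x2   = c("Light", "Light", "Light", "Light"),
  x3   = c("Light", "Moderate", "Light", "Light"),
  stringsAsFactors = FALSE
)

# Every intensity that should appear for each id, whether observed or not
levels_all <- c("Sedentary", "Light", "Moderate", "Vigorous")

full_counts <- df_q %>%
  select(-time) %>%
  pivot_longer(-id, names_to = "key", values_to = "intensity") %>%
  count(id, intensity, key) %>%
  # add the missing id / intensity / predictor combinations with n = 0
  complete(id, intensity = levels_all, key, fill = list(n = 0L)) %>%
  pivot_wider(names_from = key, values_from = n, names_sort = TRUE)

print(full_counts)
```

With this input, each ID gets one row per intensity level, including an all-zero Moderate row for id 2 as in the question's desired output.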
This question about reshaping by ID and event-intensity prediction (counts) is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/50308817/