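I'm trying to make every id/year/month combination have rows for all seven weekdays, with NA in amount for the weekdays that are missing.
Here are the data frames and my attempt at this task: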
> df
id year month weekday amount
1 1 2015 1 Friday 3650.43
2 2 2015 1 Monday 1271.12
3 1 2015 2 Friday 1315.79
4 2 2015 2 Monday 2195.37
> wday
weekday
1 Friday
2 Saturday
3 Wednesday
4 Sunday
5 Tuesday
6 Monday
7 Thursday
I tried using group_by() followed by a right join, but it doesn't produce what I expected. Is there a simple way to get the result I'm after?
> df <- df %>% group_by(id, year, month) %>% right_join(wday)
Joining by: "weekday"
> df
Source: local data frame [9 x 5]
Groups: id, year, month [?]
id year month weekday amount
(dbl) (int) (int) (chr) (dbl)
1 1 2015 1 Friday 3650.43
2 1 2015 2 Friday 1315.79
3 NA NA NA Saturday NA
4 NA NA NA Wednesday NA
5 NA NA NA Sunday NA
6 NA NA NA Tuesday NA
7 2 2015 1 Monday 1271.12
8 2 2015 2 Monday 2195.37
9 NA NA NA Thursday NA
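I want 7 rows for each id/year/month combination, where amount is NA for the missing weekdays (or ideally zero, but I know how to get that with mutate()).
The resulting data frame should look like this: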
> df
id year month weekday amount
1 1 2015 1 Friday 3650.43
2 1 2015 1 Monday 0.00
3 1 2015 1 Saturday 0.00
4 1 2015 1 Sunday 0.00
5 1 2015 1 Thursday 0.00
6 1 2015 1 Tuesday 0.00
7 1 2015 1 Wednesday 0.00
8 1 2015 2 Friday 1315.79
9 1 2015 2 Monday 0.00
10 1 2015 2 Saturday 0.00
11 1 2015 2 Sunday 0.00
12 1 2015 2 Thursday 0.00
13 1 2015 2 Tuesday 0.00
14 1 2015 2 Wednesday 0.00
15 2 2015 1 Friday 0.00
16 2 2015 1 Monday 1271.12
17 2 2015 1 Saturday 0.00
18 2 2015 1 Sunday 0.00
19 2 2015 1 Thursday 0.00
20 2 2015 1 Tuesday 0.00
21 2 2015 1 Wednesday 0.00
22 2 2015 2 Friday 0.00
23 2 2015 2 Monday 2195.37
24 2 2015 2 Saturday 0.00
25 2 2015 2 Sunday 0.00
26 2 2015 2 Thursday 0.00
27 2 2015 2 Tuesday 0.00
28 2 2015 2 Wednesday 0.00
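Best answer
We can use expand.grid to build every id/year/month/weekday combination and then left-join the original data onto it: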
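library(dplyr)

# Build the full grid from the unique id/year/month values and all seven weekdays
expand.grid(c(lapply(df[1:3], unique), wday['weekday'])) %>%
left_join(., df) %>%
# Combinations with no matching row in df get NA; replace those with 0
mutate(amount=replace(amount, is.na(amount), 0)) %>%
arrange(id, year, month, weekday)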
# id year month weekday amount
#1 1 2015 1 Friday 3650.43
#2 1 2015 1 Monday 0.00
#3 1 2015 1 Saturday 0.00
#4 1 2015 1 Sunday 0.00
#5 1 2015 1 Thursday 0.00
#6 1 2015 1 Tuesday 0.00
#7 1 2015 1 Wednesday 0.00
#8 1 2015 2 Friday 1315.79
#9 1 2015 2 Monday 0.00
#10 1 2015 2 Saturday 0.00
#11 1 2015 2 Sunday 0.00
#12 1 2015 2 Thursday 0.00
#13 1 2015 2 Tuesday 0.00
#14 1 2015 2 Wednesday 0.00
#15 2 2015 1 Friday 0.00
#16 2 2015 1 Monday 1271.12
#17 2 2015 1 Saturday 0.00
#18 2 2015 1 Sunday 0.00
#19 2 2015 1 Thursday 0.00
#20 2 2015 1 Tuesday 0.00
#21 2 2015 1 Wednesday 0.00
#22 2 2015 2 Friday 0.00
#23 2 2015 2 Monday 2195.37
#24 2 2015 2 Saturday 0.00
#25 2 2015 2 Sunday 0.00
#26 2 2015 2 Thursday 0.00
#27 2 2015 2 Tuesday 0.00
#28 2 2015 2 Wednesday 0.00
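A note on why the original attempt fell short: the join message shows it matched on weekday only, and group_by() does not change how rows are matched, so each weekday in wday was matched against the whole data frame rather than once per group. As an alternative sketch that is not part of the original answer, and assuming the tidyr package is installed, complete() with nesting() fills in the missing weekdays for each id/year/month combination that actually occurs in the data. The df and wday objects below are rebuilt from the printed output in the question, so the column types are an assumption.

```r
library(dplyr)
library(tidyr)

# Data re-created from the question's printed output (types assumed)
df <- data.frame(
  id      = c(1, 2, 1, 2),
  year    = c(2015L, 2015L, 2015L, 2015L),
  month   = c(1L, 1L, 2L, 2L),
  weekday = c("Friday", "Monday", "Friday", "Monday"),
  amount  = c(3650.43, 1271.12, 1315.79, 2195.37),
  stringsAsFactors = FALSE
)
wday <- data.frame(
  weekday = c("Friday", "Saturday", "Wednesday", "Sunday",
              "Tuesday", "Monday", "Thursday"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # nesting() keeps only id/year/month combinations present in df;
  # each of them is crossed with all seven weekdays
  complete(nesting(id, year, month), weekday = wday$weekday,
           fill = list(amount = 0)) %>%   # missing amounts become 0
  arrange(id, year, month, weekday)

print(result, n = 28)
```

One difference from expand.grid on the unique values of each column: if some id never appears in a given month, nesting() will not invent that id/month pair, whereas the full grid would. For the sample data both approaches give the same 28 rows.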
Regarding "r - dplyr - right join after group_by does not produce the desired/expected result", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34382345/