I have simple time-tracking data like the following:
library(tibble)
df <- tribble(~Date, ~Name, ~Team, ~Status, ~Hours_Type, ~Hours, ~Standard, ~Deficit, ~Overtime, ~Leave,
"April 16 2021", "Jeff", "Coastal", "FT", "Billable", 40, 40, 0, 0, 0,
"April 23 2021", "Jeff", "Coastal", "FT", "Billable", 40, 40, 0, 0, 0,
"April 16 2021", "Jeff", "Coastal", "FT", "Leave", 0, 0, 0, 0, 0,
"April 23 2021", "Jeff", "Coastal", "FT", "Leave", 0, 0, 0, 0, 0,
"April 16 2021", "Megan", "Coastal", "FT", "Billable", 40, 40, 0, 0, 0,
"April 23 2021", "Megan", "Coastal", "FT", "Billable", 40, 40, 0, 0, 0,
"April 16 2021", "Megan", "Coastal", "FT", "Leave", 0, 0, 0, 0, 0,
"April 23 2021", "Megan", "Coastal", "FT", "Leave", 0, 0, 0, 0, 0,
"April 16 2021", "Minden", "Coastal", "FT", "Billable", 16, 16, 24, 0, 0,
"April 23 2021", "Minden", "Coastal", "FT", "Billable", 28, 28, 12, 0, 0,
"April 16 2021", "Minden", "Coastal", "FT", "Leave", 24, 0, 0, 0, 24,
"April 23 2021", "Minden", "Coastal", "FT", "Leave", 0, 0, 0, 0, 0)
# A tibble: 12 x 10
Date Name Team Status Hours_Type Hours Standard Deficit Overtime Leave
<chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 April 16 2021 Jeff Coastal FT Billable 40 40 0 0 0
2 April 23 2021 Jeff Coastal FT Billable 40 40 0 0 0
3 April 16 2021 Jeff Coastal FT Leave 0 0 0 0 0
4 April 23 2021 Jeff Coastal FT Leave 0 0 0 0 0
5 April 16 2021 Megan Coastal FT Billable 40 40 0 0 0
6 April 23 2021 Megan Coastal FT Billable 40 40 0 0 0
7 April 16 2021 Megan Coastal FT Leave 0 0 0 0 0
8 April 23 2021 Megan Coastal FT Leave 0 0 0 0 0
9 April 16 2021 Minden Coastal FT Billable 16 16 24 0 0
10 April 23 2021 Minden Coastal FT Billable 28 28 12 0 0
11 April 16 2021 Minden Coastal FT Leave 24 0 0 0 24
12 April 23 2021 Minden Coastal FT Leave 0 0 0 0 0
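How can I query the Leave column and check whether the Deficit column for the same Date should actually be 0, because it is not really a deficit when it is covered by leave taken on the same date?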
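For example, how can I have R check the Deficit, Leave, Name, and Date columns so that it modifies this table and changes Minden's 24-hour Deficit on April 16 to 0 (row 9), because he covered it with 24 hours of Leave on April 16 (row 11)?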
This would be the expected result, and the code should generalize across the whole dataset:
# A tibble: 12 x 10
Date Name Team Status Hours_Type Hours Standard Deficit Overtime Leave
<chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 April 16 2021 Jeff Coastal FT Billable 40 40 0 0 0
2 April 23 2021 Jeff Coastal FT Billable 40 40 0 0 0
3 April 16 2021 Jeff Coastal FT Leave 0 0 0 0 0
4 April 23 2021 Jeff Coastal FT Leave 0 0 0 0 0
5 April 16 2021 Megan Coastal FT Billable 40 40 0 0 0
6 April 23 2021 Megan Coastal FT Billable 40 40 0 0 0
7 April 16 2021 Megan Coastal FT Leave 0 0 0 0 0
8 April 23 2021 Megan Coastal FT Leave 0 0 0 0 0
9 April 16 2021 Minden Coastal FT Billable 16 16 0 0 0
10 April 23 2021 Minden Coastal FT Billable 28 28 12 0 0
11 April 16 2021 Minden Coastal FT Leave 24 0 0 0 24
12 April 23 2021 Minden Coastal FT Leave 0 0 0 0 0
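Note: I have to keep the Leave column because I use it in a stacked bar chart to visualize this data. In this example, the 24 hours of Deficit for Minden should actually count as Leave, but I don't know how to make this change automatically, only by hand.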
Best answer
I think this strategy will work best (although your example does not cover other possible scenarios):
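library(dplyr)
# Within each Date/Name group, subtract that group's Leave hours from the Deficit on the Billable row
df %>% group_by(Date, Name) %>%
  mutate(Deficit = ifelse(Hours_Type == "Billable", Deficit - Leave[Hours_Type == "Leave"], Deficit))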
# A tibble: 12 x 10
# Groups: Date, Name [6]
Date Name Team Status Hours_Type Hours Standard Deficit Overtime Leave
<chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 April 16 2021 Jeff Coastal FT Billable 40 40 0 0 0
2 April 23 2021 Jeff Coastal FT Billable 40 40 0 0 0
3 April 16 2021 Jeff Coastal FT Leave 0 0 0 0 0
4 April 23 2021 Jeff Coastal FT Leave 0 0 0 0 0
5 April 16 2021 Megan Coastal FT Billable 40 40 0 0 0
6 April 23 2021 Megan Coastal FT Billable 40 40 0 0 0
7 April 16 2021 Megan Coastal FT Leave 0 0 0 0 0
8 April 23 2021 Megan Coastal FT Leave 0 0 0 0 0
9 April 16 2021 Minden Coastal FT Billable 16 16 0 0 0
10 April 23 2021 Minden Coastal FT Billable 28 28 12 0 0
11 April 16 2021 Minden Coastal FT Leave 24 0 0 0 24
12 April 23 2021 Minden Coastal FT Leave 0 0 0 0 0
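The grouped mutate above relies on every Date/Name group having exactly one Leave row and on leave never exceeding the deficit. A minimal sketch of a more defensive variant follows, assuming you want a missing Leave row to count as 0 hours and the Deficit to stop at 0. The `toy` data, the names `adjusted` and `leave_total`, and both of those rules are illustrative assumptions, not part of the original answer.

```r
library(dplyr)
library(tibble)

# Small hypothetical data set (not from the question) covering three cases.
toy <- tribble(
  ~Date,           ~Name, ~Hours_Type, ~Deficit, ~Leave,
  "April 16 2021", "A",   "Billable",  24,       0,
  "April 16 2021", "A",   "Leave",     0,        24,
  "April 16 2021", "B",   "Billable",  8,        0,   # B has no Leave row
  "April 16 2021", "C",   "Billable",  4,        0,
  "April 16 2021", "C",   "Leave",     0,        10   # leave exceeds the deficit
)

adjusted <- toy %>%
  group_by(Date, Name) %>%
  mutate(
    # Total leave for this person on this date; sum() of nothing is 0.
    leave_total = sum(Leave[Hours_Type == "Leave"]),
    # Offset only the Billable rows, and never let Deficit drop below 0.
    Deficit = if_else(Hours_Type == "Billable",
                      pmax(Deficit - leave_total, 0),
                      Deficit)
  ) %>%
  ungroup() %>%
  select(-leave_total)

adjusted
```

Using `sum()` keeps the result one value per group even when a group has zero or several Leave rows, and `pmax(..., 0)` prevents a negative Deficit. Whether excess leave should instead be carried somewhere else is a business rule the question does not specify.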
Let's take another scenario: Jeff has a 12-hour deficit on the 16th and 6 hours of leave, and Megan has a 15-hour deficit on the 23rd with no leave. In that case df would be
# A tibble: 12 x 10
Date Name Team Status Hours_Type Hours Standard Deficit Overtime Leave
<chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 April 16 2021 Jeff Coastal FT Billable 40 40 12 0 0
2 April 23 2021 Jeff Coastal FT Billable 40 40 0 0 0
3 April 16 2021 Jeff Coastal FT Leave 0 0 0 0 6
4 April 23 2021 Jeff Coastal FT Leave 0 0 0 0 0
5 April 16 2021 Megan Coastal FT Billable 40 40 0 0 0
6 April 23 2021 Megan Coastal FT Billable 40 40 15 0 0
7 April 16 2021 Megan Coastal FT Leave 0 0 0 0 0
8 April 23 2021 Megan Coastal FT Leave 0 0 0 0 0
9 April 16 2021 Minden Coastal FT Billable 16 16 24 0 0
10 April 23 2021 Minden Coastal FT Billable 28 28 12 0 0
11 April 16 2021 Minden Coastal FT Leave 24 0 0 0 24
12 April 23 2021 Minden Coastal FT Leave 0 0 0 0 0
and the output is
# A tibble: 12 x 10
# Groups: Date, Name [6]
Date Name Team Status Hours_Type Hours Standard Deficit Overtime Leave
<chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 April 16 2021 Jeff Coastal FT Billable 40 40 6 0 0
2 April 23 2021 Jeff Coastal FT Billable 40 40 0 0 0
3 April 16 2021 Jeff Coastal FT Leave 0 0 0 0 6
4 April 23 2021 Jeff Coastal FT Leave 0 0 0 0 0
5 April 16 2021 Megan Coastal FT Billable 40 40 0 0 0
6 April 23 2021 Megan Coastal FT Billable 40 40 15 0 0
7 April 16 2021 Megan Coastal FT Leave 0 0 0 0 0
8 April 23 2021 Megan Coastal FT Leave 0 0 0 0 0
9 April 16 2021 Minden Coastal FT Billable 16 16 0 0 0
10 April 23 2021 Minden Coastal FT Billable 28 28 12 0 0
11 April 16 2021 Minden Coastal FT Leave 24 0 0 0 24
12 April 23 2021 Minden Coastal FT Leave 0 0 0 0 0
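This should match your expectations and the logic you described. In the modified scenario, only the Deficit values on the Billable rows change.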
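To reproduce the second scenario end to end, the sketch below rebuilds it with only the columns the calculation uses (a trimming assumption made to keep it short) and applies the same grouped mutate. The names `scenario` and `result` are illustrative.

```r
library(dplyr)
library(tibble)

# Trimmed copy of the second scenario: Team, Status, Hours, Standard and
# Overtime are omitted because the calculation does not read them.
scenario <- tribble(
  ~Date,           ~Name,    ~Hours_Type, ~Deficit, ~Leave,
  "April 16 2021", "Jeff",   "Billable",  12,       0,
  "April 23 2021", "Jeff",   "Billable",  0,        0,
  "April 16 2021", "Jeff",   "Leave",     0,        6,
  "April 23 2021", "Jeff",   "Leave",     0,        0,
  "April 16 2021", "Megan",  "Billable",  0,        0,
  "April 23 2021", "Megan",  "Billable",  15,       0,
  "April 16 2021", "Megan",  "Leave",     0,        0,
  "April 23 2021", "Megan",  "Leave",     0,        0,
  "April 16 2021", "Minden", "Billable",  24,       0,
  "April 23 2021", "Minden", "Billable",  12,       0,
  "April 16 2021", "Minden", "Leave",     0,        24,
  "April 23 2021", "Minden", "Leave",     0,        0
)

result <- scenario %>%
  group_by(Date, Name) %>%
  mutate(Deficit = ifelse(Hours_Type == "Billable",
                          Deficit - Leave[Hours_Type == "Leave"],
                          Deficit)) %>%
  ungroup()

# Show only the Billable rows, which are the ones the adjustment touches.
result %>% filter(Hours_Type == "Billable")
```

This version assumes exactly one Leave row per Date/Name pair, as in the data shown. The trailing `ungroup()` is optional and only stops later steps from being computed per group.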
Source: the Stack Overflow question "r - How to query a data frame and change data based on another column in R?": https://stackoverflow.com/questions/67119391/