I am trying to do an inventory calculation in R that requires a row-wise calculation for each Mat-Plant combination. Here is a test data set -
df <- structure(list(Mat = c("A", "A", "A", "A", "A", "A", "B", "B"
), Plant = c("P1", "P1", "P1", "P2", "P2", "P2", "P1", "P1"),
Day = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L), UU = c(0L, 10L,
0L, 0L, 0L, 120L, 10L, 0L), CumDailyFcst = c(11L, 22L, 33L,
0L, 5L, 10L, 20L, 50L)), .Names = c("Mat", "Plant", "Day",
"UU", "CumDailyFcst"), class = "data.frame", row.names = c(NA,
-8L))
  Mat Plant Day  UU CumDailyFcst
1   A    P1   1   0           11
2   A    P1   2  10           22
3   A    P1   3   0           33
4   A    P2   1   0            0
5   A    P2   2   0            5
6   A    P2   3 120           10
7   B    P1   1  10           20
8   B    P1   2   0           50
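I need a new field "EffectiveFcst" such that
when Day = 1 then EffectiveFcst = CumDailyFcst
and for the following days EffectiveFcst = CumDailyFcst - previous CumDailyFcst + max(previous EffectiveFcst - previous UU, 0), which is the rule the attempt further down tries to encode. This is the desired output -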
  Mat Plant Day  UU CumDailyFcst EffectiveFcst
1   A    P1   1   0           11            11
2   A    P1   2  10           22            22
3   A    P1   3   0           33            23
4   A    P2   1   0            0             0
5   A    P2   2   0            5             5
6   A    P2   3 120           10            10
7   B    P1   1  10           20            20
8   B    P1   2   0           50            40
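To make the rule concrete, here is a minimal base R sketch of it as a plain loop. It assumes that on later days EffectiveFcst = CumDailyFcst - previous CumDailyFcst + max(previous EffectiveFcst - previous UU, 0), which is inferred from the attempted mutate() code in this question, and that rows are already sorted by Day within each Mat-Plant group. The helper name effective_fcst is hypothetical, and this is not necessarily the loop used on the real table.

```r
# Test data from the question, rebuilt with data.frame() so the block runs on its own.
df <- data.frame(
  Mat          = c("A", "A", "A", "A", "A", "A", "B", "B"),
  Plant        = c("P1", "P1", "P1", "P2", "P2", "P2", "P1", "P1"),
  Day          = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L),
  UU           = c(0L, 10L, 0L, 0L, 0L, 120L, 10L, 0L),
  CumDailyFcst = c(11L, 22L, 33L, 0L, 5L, 10L, 20L, 50L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: applies the assumed recurrence to one Mat-Plant group.
effective_fcst <- function(cum_fcst, uu) {
  n <- length(cum_fcst)
  out <- numeric(n)
  out[1] <- cum_fcst[1]                      # Day 1: EffectiveFcst = CumDailyFcst
  for (i in seq_len(n)[-1]) {                # Days 2..n (empty when n == 1)
    carry <- max(out[i - 1] - uu[i - 1], 0)  # leftover from the previous day, floored at 0
    out[i] <- cum_fcst[i] - cum_fcst[i - 1] + carry
  }
  out
}

# Row indices of each Mat-Plant combination that actually occurs in the data.
groups <- split(seq_len(nrow(df)), list(df$Mat, df$Plant), drop = TRUE)

df$EffectiveFcst <- NA_real_
for (idx in groups) {
  df$EffectiveFcst[idx] <- effective_fcst(df$CumDailyFcst[idx], df$UU[idx])
}
print(df)
```

On the sample data this reproduces the EffectiveFcst column shown above, which is a useful check on the assumed rule before trying to vectorise it.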
I am currently using a for loop, but the actual table has more than 300K rows, so I am hoping to do this with tidyverse for a more elegant and faster approach. I tried the following, but it did not work -
group_by(df, Mat, Plant) %>%
  mutate(EffectiveFcst = ifelse(row_number()==1, CumDailyFcst, 0)) %>%
  mutate(EffectiveFcst = ifelse(row_number() > 1, CumDailyFcst - lag(CumDailyFcst, default = 0) + max(lag(EffectiveFcst, default = 0) - lag(UU, default = 0), 0), EffectiveFcst)) %>%
  print(n = nrow(.))
Best answer
We can use accumulate from purrr
library(tidyverse)
df %>%
  group_by(Mat, Plant) %>%
  mutate(EffectiveFcst = accumulate(CumDailyFcst - lag(UU, default = 0),
                                    ~ .y, .init = first(CumDailyFcst))[-1])
# A tibble: 8 x 6
# Groups:   Mat, Plant [3]
#  Mat   Plant   Day    UU CumDailyFcst EffectiveFcst
#  <chr> <chr> <int> <int>        <int>         <dbl>
#1 A     P1        1     0           11            11
#2 A     P1        2    10           22            22
#3 A     P1        3     0           33            23
#4 A     P2        1     0            0             0
#5 A     P2        2     0            5             5
#6 A     P2        3   120           10            10
#7 B     P1        1    10           20            20
#8 B     P1        2     0           50            40
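Note that `~ .y` hands each element back unchanged, so the call above evaluates to CumDailyFcst - lag(UU, default = 0), which happens to match the desired output on this sample. If the intended rule is the recurrence the question's attempt tries to write (today's forecast increment plus whatever is left of the previous EffectiveFcst after the previous UU, floored at 0), a version that actually uses the accumulator might look like the sketch below. It relies on purrr's accumulate2(); the helper name fcst_step and the temporary columns inc and prev_uu are hypothetical, and the recurrence itself is an assumption read off the question's code. A plain mutate() cannot express this, because lag(EffectiveFcst) sees the column as it stood before the step, and max() there collapses the whole vector to one value.

```r
library(dplyr)
library(purrr)

# Test data from the question, rebuilt with data.frame() so the block runs on its own.
df <- data.frame(
  Mat          = c("A", "A", "A", "A", "A", "A", "B", "B"),
  Plant        = c("P1", "P1", "P1", "P2", "P2", "P2", "P1", "P1"),
  Day          = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L),
  UU           = c(0L, 10L, 0L, 0L, 0L, 120L, 10L, 0L),
  CumDailyFcst = c(11L, 22L, 33L, 0L, 5L, 10L, 20L, 50L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one step of the assumed recurrence.
# acc = previous EffectiveFcst, inc = today's forecast increment,
# prev_uu = previous day's UU.
fcst_step <- function(acc, inc, prev_uu) inc + max(acc - prev_uu, 0)

res <- df %>%
  arrange(Mat, Plant, Day) %>%                # the recurrence needs Day order
  group_by(Mat, Plant) %>%
  mutate(
    inc     = CumDailyFcst - lag(CumDailyFcst, default = 0L),
    prev_uu = lag(UU, default = 0L),
    # Starting from 0 makes Day 1 come out as CumDailyFcst itself;
    # [-1] drops that starting value again.
    EffectiveFcst = unlist(accumulate2(inc, prev_uu, fcst_step, .init = 0))[-1]
  ) %>%
  ungroup() %>%
  select(-inc, -prev_uu)

print(res)
```

unlist() is there so the result is a plain numeric vector whether or not accumulate2() simplifies its output. The two approaches only diverge once a group has a day after UU exceeds the previous EffectiveFcst, which the sample data never exercises.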
For "r - tidyverse: row wise calculations by group", the corresponding question is on Stack Overflow: https://stackoverflow.com/questions/52728422/