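For the sample dataset below, I need to remove all rows for a customer (CustomerID) that come after their first purchase (CustomerStatus = Purchased). Some customers never purchase the product, and I still want to keep every observation for those customers. It is important to keep the Date variable.
I am having trouble removing rows within groups. The original data is not grouped as neatly as this; I have simplified the problem I am facing. Any help is appreciated.
Here is a sample dataset: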
SalesPerson CustomerID Date CustomerStatus
Amanda 2000 1/5/2017 Intro
Amanda 2000 1/6/2017 Email
Amanda 2000 1/15/2017 PhoneCall
Amanda 2000 2/15/2017 Purchased
Amanda 2001 1/3/2017 Intro
Amanda 2001 1/4/2017 Email
Amanda 2001 1/12/2017 PhoneCall
Amanda 2001 1/15/2017 Conference
Amanda 2001 2/4/2017 Purchased
Amanda 2001 3/17/2017 Meeting
Amanda 2001 3/20/2017 Email
Kyle 2002 1/19/2017 Intro
Kyle 2002 1/20/2017 Email
Kyle 2002 1/21/2017 PhoneCall
Sharon 2006 1/8/2017 Intro
Sharon 2006 1/10/2017 Meeting
Sharon 2006 1/19/2017 Purchased
Sharon 2006 1/30/2017 Conference
Sharon 2006 2/10/2017 Purchased
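For anyone who wants to reproduce this, the sample above can be read into a data frame roughly as follows. This is only a sketch: it assumes the columns are separated by whitespace, and it uses the name `df1` because that is what the answer below calls the data.

```r
# Rebuild the sample data as a data frame (assumes whitespace-separated columns)
df1 <- read.table(text = "
SalesPerson CustomerID Date CustomerStatus
Amanda 2000 1/5/2017 Intro
Amanda 2000 1/6/2017 Email
Amanda 2000 1/15/2017 PhoneCall
Amanda 2000 2/15/2017 Purchased
Amanda 2001 1/3/2017 Intro
Amanda 2001 1/4/2017 Email
Amanda 2001 1/12/2017 PhoneCall
Amanda 2001 1/15/2017 Conference
Amanda 2001 2/4/2017 Purchased
Amanda 2001 3/17/2017 Meeting
Amanda 2001 3/20/2017 Email
Kyle 2002 1/19/2017 Intro
Kyle 2002 1/20/2017 Email
Kyle 2002 1/21/2017 PhoneCall
Sharon 2006 1/8/2017 Intro
Sharon 2006 1/10/2017 Meeting
Sharon 2006 1/19/2017 Purchased
Sharon 2006 1/30/2017 Conference
Sharon 2006 2/10/2017 Purchased
", header = TRUE, stringsAsFactors = FALSE)

# Date stays a character column here; CustomerID is read as an integer
str(df1)
```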
The output should look like this:
SalesPerson CustomerID Date CustomerStatus
Amanda 2000 1/5/2017 Intro
Amanda 2000 1/6/2017 Email
Amanda 2000 1/15/2017 PhoneCall
Amanda 2000 2/15/2017 Purchased
Amanda 2001 1/3/2017 Intro
Amanda 2001 1/4/2017 Email
Amanda 2001 1/12/2017 PhoneCall
Amanda 2001 1/15/2017 Conference
Amanda 2001 2/4/2017 Purchased
Kyle 2002 1/19/2017 Intro
Kyle 2002 1/20/2017 Email
Kyle 2002 1/21/2017 PhoneCall
Sharon 2006 1/8/2017 Intro
Sharon 2006 1/10/2017 Meeting
Sharon 2006 1/19/2017 Purchased
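Best answer
We can group by 'SalesPerson' and 'CustomerID', then build a logical index within each group to filter out the rows that follow the first purchase: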
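library(dplyr)
# Within each customer, keep rows up to and including the first purchase
df1 %>%
  group_by(SalesPerson, CustomerID) %>%
  # lag() shifts the purchase flag down one row, so cumsum() stays at 0
  # through the first "Purchased" row and is >= 1 on every row after it
  filter(cumsum(lag(CustomerStatus == "Purchased", default = FALSE)) < 1)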
# A tibble: 15 x 4
# Groups: SalesPerson, CustomerID [4]
# SalesPerson CustomerID Date CustomerStatus
# <chr> <int> <chr> <chr>
# 1 Amanda 2000 1/5/2017 Intro
# 2 Amanda 2000 1/6/2017 Email
# 3 Amanda 2000 1/15/2017 PhoneCall
# 4 Amanda 2000 2/15/2017 Purchased
# 5 Amanda 2001 1/3/2017 Intro
# 6 Amanda 2001 1/4/2017 Email
# 7 Amanda 2001 1/12/2017 PhoneCall
# 8 Amanda 2001 1/15/2017 Conference
# 9 Amanda 2001 2/4/2017 Purchased
#10 Kyle 2002 1/19/2017 Intro
#11 Kyle 2002 1/20/2017 Email
#12 Kyle 2002 1/21/2017 PhoneCall
#13 Sharon 2006 1/8/2017 Intro
#14 Sharon 2006 1/10/2017 Meeting
#15 Sharon 2006 1/19/2017 Purchased
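To see why the filter keeps the purchase row but drops everything after it, it may help to evaluate the index step by step for a single customer. The following is a minimal sketch rather than part of the original answer; the vector `status` and the intermediate names are hypothetical, with `status` standing in for customer 2006's CustomerStatus values.

```r
library(dplyr)

# Hypothetical vector: CustomerStatus values for customer 2006, in row order
status <- c("Intro", "Meeting", "Purchased", "Conference", "Purchased")

# TRUE on rows where a purchase happened: F F T F T
purchased <- status == "Purchased"

# Shift the flag down one row so the purchase row itself is not flagged: F F F T F
prev_purchased <- lag(purchased, default = FALSE)

# Running count of purchases seen on earlier rows: 0 0 0 1 1
n_before <- cumsum(prev_purchased)

# Keep only rows with no earlier purchase: T T T F F
keep <- n_before < 1

status[keep]  # returns "Intro", "Meeting", "Purchased"
```

Without the `lag()`, the running count would already be 1 on the purchase row, so that row would be dropped as well. A customer who never purchases has a count of 0 throughout, which is why all of Kyle's rows survive.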
On the topic of r - filtering subsequent rows within a group after a condition is met in R, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44762058/