r - How to filter rows with multiple conditions?

I have a dataset like the one below.

library(tibble)

df <- tribble(
  ~shop_id,  ~id,      ~key,        ~date,      ~status, 
  "1",       "10",     "abc",    '2020-05-04',   'good',
  "1",       "10",     "def",    '2020-05-03',   'normal',
  "1",       "10",     "glm",    '2020-05-03',   'bad',
  "1",       "20",     "ksr",    '2020-05-01',   'bad',
  "1",       "20",     "tyz",    '2020-05-02',   'bad',
  "2",       "20",     "uyv",    '2020-05-01',   'good',
  "2",       "20",     "mys",    '2020-05-01',   'normal',
  "2",       "30",     "ert",    '2020-05-01',   'bad',
  "2",       "40",     "yer",    '2020-05-05',   'good',
  "2",       "40",     "tet",    '2020-05-05',   'bad',
)

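Now I want to filter the data with the following conditions:

Group the data by shop_id and id and look at the dates. Then:

If the date is the group's minimum when status == 'bad', remove those rows. For example, the first three rows are removed from the dataset because of this condition (see desired_df).
If a group has only the 'bad' status, keep all of its rows. Rows 4 and 5 stay in the desired dataset because of this condition.
If the rows have the same date when status == 'bad', keep both rows in the desired dataset.

In other words, after grouping by shop_id and id, I only want to see the rows where the date of the 'bad' status is the maximum. However, when the date is the same for both statuses, keep the rows.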
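The per-group facts these rules depend on can be inspected directly. The following is a minimal sketch and not part of the original question: the summary column names (only_bad, has_bad, first_date, status_at_first) are made up for illustration, and base as.Date() is assumed to be an acceptable way to parse the date strings.

```r
library(dplyr)
library(tibble)

# Same data as in the question.
df <- tribble(
  ~shop_id, ~id,  ~key,  ~date,        ~status,
  "1",      "10", "abc", "2020-05-04", "good",
  "1",      "10", "def", "2020-05-03", "normal",
  "1",      "10", "glm", "2020-05-03", "bad",
  "1",      "20", "ksr", "2020-05-01", "bad",
  "1",      "20", "tyz", "2020-05-02", "bad",
  "2",      "20", "uyv", "2020-05-01", "good",
  "2",      "20", "mys", "2020-05-01", "normal",
  "2",      "30", "ert", "2020-05-01", "bad",
  "2",      "40", "yer", "2020-05-05", "good",
  "2",      "40", "tet", "2020-05-05", "bad"
)

# One row per (shop_id, id) group, with the facts the rules refer to.
# The column names below are illustrative, not from the question.
group_summary <- df %>%
  mutate(date = as.Date(date)) %>%
  group_by(shop_id, id) %>%
  summarise(
    only_bad        = all(status == "bad"),   # group contains nothing but "bad"
    has_bad         = any(status == "bad"),   # group contains at least one "bad"
    first_date      = min(date),              # earliest date in the group
    # statuses that occur on the earliest date, e.g. "bad, normal"
    status_at_first = paste(sort(unique(status[date == min(date)])), collapse = ", "),
    .groups = "drop"
  )

group_summary
```

Comparing this summary with desired_df shows which groups survive: the groups where only_bad is TRUE, plus the group whose earliest date carries both "bad" and "good".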


desired_df <- tribble(
  ~shop_id,  ~id,      ~key,      ~date,      ~status, 
  "1",       "20",     "ksr",   '2020-05-01',   'bad',
  "1",       "20",     "tyz",   '2020-05-02',   'bad',
  "2",       "30",     "ert",   '2020-05-01',   'bad',
  "2",       "40",     "yer",   '2020-05-05',   'good',
  "2",       "40",     "tet",   '2020-05-05',   'bad', 
)

Any help or assistance would be greatly appreciated!

Best answer

One approach is to use case_when.

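library(dplyr)
library(lubridate)

df %>%
  mutate(date = ymd(date)) %>%   # parse the date strings as Date
  group_by(shop_id, id) %>%
  # one keep/drop decision per group; the first matching condition wins
  mutate(keep = case_when(all(status != "bad") ~ FALSE,                     # no "bad" row: drop
                          all(status == "bad") ~ TRUE,                      # only "bad" rows: keep
                          all(status[date == min(date)] == "bad") ~ FALSE,  # earliest date is only "bad": drop
                          any(status[date == min(date)] == "good") ~ TRUE,  # "good" is on the earliest date: keep
                          TRUE ~ FALSE)) %>%                                # anything else: drop
  filter(keep) %>%
  dplyr::select(-keep)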

# A tibble: 5 x 5
# Groups:   shop_id, id [3]
  shop_id id    key   date       status
  <chr>   <chr> <chr> <date>     <chr> 
1 1       20    ksr   2020-05-01 bad   
2 1       20    tyz   2020-05-02 bad   
3 2       30    ert   2020-05-01 bad   
4 2       40    yer   2020-05-05 good  
5 2       40    tet   2020-05-05 bad
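The same rules can also be written as an ordinary function that is evaluated once per group, which some readers may find easier to follow than scalar conditions inside case_when. This is a sketch under assumptions, not the answer's own code: keep_group is a hypothetical helper name, and base as.Date() stands in for ymd().

```r
library(dplyr)
library(tibble)

# Same data as in the question.
df <- tribble(
  ~shop_id, ~id,  ~key,  ~date,        ~status,
  "1",      "10", "abc", "2020-05-04", "good",
  "1",      "10", "def", "2020-05-03", "normal",
  "1",      "10", "glm", "2020-05-03", "bad",
  "1",      "20", "ksr", "2020-05-01", "bad",
  "1",      "20", "tyz", "2020-05-02", "bad",
  "2",      "20", "uyv", "2020-05-01", "good",
  "2",      "20", "mys", "2020-05-01", "normal",
  "2",      "30", "ert", "2020-05-01", "bad",
  "2",      "40", "yer", "2020-05-05", "good",
  "2",      "40", "tet", "2020-05-05", "bad"
)

# Hypothetical helper: returns a single TRUE/FALSE for one group.
keep_group <- function(status, date) {
  at_first <- status[date == min(date)]     # statuses on the group's earliest date
  if (!any(status == "bad")) return(FALSE)  # no "bad" row at all: drop
  if (all(status == "bad")) return(TRUE)    # only "bad" rows: keep
  any(at_first == "good")                   # mixed group: keep only if "good" is on the earliest date
}

result <- df %>%
  mutate(date = as.Date(date)) %>%
  group_by(shop_id, id) %>%
  filter(keep_group(status, date)) %>%  # the single value is recycled across the group
  ungroup()

result$key
# "ksr" "tyz" "ert" "yer" "tet"
```

The keys match the rows of desired_df, in the same order. The final ungroup() is a design choice of this sketch; the case_when version above returns a grouped tibble.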

Regarding "r - How to filter rows with multiple conditions?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61624284/
