I'm trying to clean up a dataset that looks like this:
Name Mon Tue
abc 44970 44971
def NA 1
gh 1 NA
I want to change every "1" to the date of that column (I'll format it later). Desired output:
Name Mon Tue
abc 44970 44971
def NA 44971
gh 44970 NA
I tried using:
test$Mon[test$Mon == '1'] <- '44970'
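This works, but it doesn't solve the problem. I have to do this for Monday through Friday, and for the 200 sheets I've read in, all of which have different dates. How can I refer to the date already in that column instead of hard-coding it?
Edit: example data: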
> dput(head(test))
structure(list(delete = c("...1", "#", "1", "1", "4", "1"), Employee = c("CBRE Downtown Corporate Office Weekly Check-In",
"Employee Name", "Ad, Ant", "Ak, Ki", "Al, An",
"Am, Ni"), Manager = c("...3", "Manager Name", "Ha, Er",
"Dahlin, Alexandra", "Kruger, Katie", "Da, Al"), Dept. = c("...4",
"Department", "Tram", "A", "T",
"Shared Office Services"), Mon = c("44970", NA, NA, NA, NA, NA
), Tue = c("44971", NA, NA, "1", "1", "1"), Wed = c("44972",
NA, "1", NA, "1", NA), Thur = c("44973", NA, NA, NA, "1", NA),
Fri = c("44974", NA, NA, NA, NA, NA)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
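Best answer
I'm assuming the data is numeric and that, wherever a value is 1, we want to copy the first value of that column. (The comparison also works on the character columns shown in the dput output, because R coerces 1 to "1" when comparing with a character vector.)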
library(dplyr)
test %>%
  mutate(across(Mon:Fri, ~ifelse(. == 1, first(.), .)))
# mutate(across(Mon:Fri, ~ifelse(. == 1, max(., na.rm = TRUE), .))) # alternate
# suggested by @M and @TarJae in case it's more reliable for this data
Result
# A tibble: 6 × 9
delete Employee Manager Dept. Mon Tue Wed Thur Fri
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 ...1 CBRE Downtown Corporate Office Weekly Check-In ...3 ...4 44970 44971 44972 44973 44974
2 # Employee Name Manager Name Department NA NA NA NA NA
3 1 Ad, Ant Ha, Er Tram NA NA 44972 NA NA
4 1 Ak, Ki Dahlin, Alexandra A NA 44971 NA NA NA
5 4 Al, An Kruger, Katie T NA 44971 44972 44973 NA
6 1 Am, Ni Da, Al Shared Office Se… NA 44971 NA NA NA
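Since the question mentions roughly 200 sheets with different dates, here is a minimal sketch of how the same replacement might be wrapped in a function and applied to a list of data frames. The helper name `fill_dates`, the list `sheets`, and the toy values are all hypothetical and not part of the original answer. The last line also assumes the numbers are Excel serial dates (origin 1899-12-30), which the question does not state explicitly.

```r
library(dplyr)

# Hypothetical helper (not from the original answer): replace "1" markers in
# the weekday columns with the value stored in the first row of each column.
fill_dates <- function(df, cols = c("Mon", "Tue", "Wed", "Thur", "Fri")) {
  df %>%
    mutate(across(any_of(cols), ~ ifelse(.x == "1", first(.x), .x)))
}

# Two toy sheets standing in for the real ones; each has its own dates.
sheets <- list(
  week1 = tibble(Name = c("abc", "def", "gh"),
                 Mon  = c("44970", NA, "1"),
                 Tue  = c("44971", "1", NA)),
  week2 = tibble(Name = c("abc", "def", "gh"),
                 Mon  = c("44977", "1", NA),
                 Tue  = c("44978", NA, "1"))
)

# Apply the same cleaning step to every sheet; no date is hard-coded.
cleaned <- lapply(sheets, fill_dates)

# Assumption: the values are Excel serial dates, so this origin applies.
mon_dates <- as.Date(as.numeric(cleaned$week1$Mon), origin = "1899-12-30")
```

`any_of()` is used so that a sheet missing one of the weekday columns is skipped rather than raising an error. Because the date is read from the first row of each column, this relies on every sheet keeping its dates in row 1, as in the example data.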
For more on replacing column values with another value, see the similar question on Stack Overflow: https://stackoverflow.com/questions/76257107/