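I am using pmax and pmin to extract the maximum and minimum values from each row. Some of my values are not statistically significant, and these are wrapped in exclamation marks (!xx!). For some reason pmax and pmin still take these values into account, so I cannot compute the difference between the significant values. Here is an example.
I want the !xx! values to be excluded when I do the following: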
DF1 = data.frame(ID=c("A","B","C","D","E"),
Var1=c("1","20","!20!","NA","!10!"),
Var2=c("!5!","NA","10","NA","NA"),
Var3=c("NA","NA","NA","30","NA"),
Var4=c("10","NA","NA","NA","NA"),
Var5=c("NA","!50!","20","NA","NA"))
DF1$max <- pmax(DF1$Var1,DF1$Var2,DF1$Var3,DF1$Var4,na.rm = TRUE)
DF1$min <- pmin(DF1$Var1,DF1$Var2,DF1$Var3,DF1$Var4,na.rm = TRUE)
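This gives me the following result:
whereas I want the following:
How can I prevent the !xx! values from being picked up by pmax and pmin? I appreciate any help!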
Best answer
Assuming your "NA" entries are meant to be real NA values (not string literals), convert them first:
DF1[-1] <- lapply(DF1[-1], function(z) replace(z, z=="NA", NA))
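As a side note on why the flagged values get picked up in the first place: the columns of DF1 are character vectors, so pmax and pmin compare strings rather than numbers. A minimal sketch, using made-up vectors x and y that are not part of the question's data:

```r
# Hypothetical toy vectors, not taken from DF1
x <- c("9", "!20!")
y <- c("10", "5")

# Both inputs are character, so the result is character as well
res_chr <- pmax(x, y)
is.character(res_chr)

# Strings are compared character by character, so "9" sorts after "10"
pmax("9", "10")

# After converting to numeric, the comparison is by value
res_num <- pmax(as.numeric("9"), as.numeric("10"))
res_num
```

This is why the steps that follow first mask the !xx! entries and then convert to numeric before calling pmax and pmin.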
Then we can do this:
do.call(pmax, c(lapply(DF1[-1], function(z) replace(z, grepl("!", z), NA)), list(na.rm = TRUE)))
# [1] "10" "20" "20" "30" NA
### and converting to numbers
do.call(pmax, c(lapply(DF1[-1], function(z) suppressWarnings(as.numeric(replace(z, grepl("!", z), NA)))), list(na.rm = TRUE)))
# [1] 10 20 20 30 NA
Storing the results:
nums <- lapply(DF1[-1], function(z) suppressWarnings(as.numeric(replace(z, grepl("!", z), NA))))
DF1$min <- do.call(pmin, c(nums, na.rm = TRUE))
DF1$max <- do.call(pmax, c(nums, na.rm = TRUE))
DF1
#   ID Var1 Var2 Var3 Var4 Var5 min max
# 1  A    1  !5!   NA   10   NA   1  10
# 2  B   20   NA   NA   NA !50!  20  20
# 3  C !20!   10   NA   NA   20  10  20
# 4  D   NA   NA   30   NA   NA  30  30
# 5  E !10!   NA   NA   NA   NA  NA  NA
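Note that we also need to pass na.rm = TRUE; otherwise any row that contains an NA (including a masked !xx! value) would return NA.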
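A small sketch of how the na.rm argument changes the result of pmax and pmin, using two made-up numeric vectors a and b (illustrative only, not from DF1):

```r
# Hypothetical toy vectors with missing values
a <- c(1, NA, 3)
b <- c(NA, NA, 2)

# With na.rm = TRUE, an NA is ignored as long as the other value is present;
# a position where every input is NA still yields NA
with_rm <- pmax(a, b, na.rm = TRUE)
with_rm      # 1 NA 3

# With na.rm = FALSE (the default), any NA in a position makes the result NA
without_rm <- pmax(a, b, na.rm = FALSE)
without_rm   # NA NA 3
```

The same holds for pmin, which is why row E in the table above ends up with NA for both min and max: all of its values are either missing or masked.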
Alternatively, we can use readr::parse_number like this:
nums <- lapply(DF1[-1], function(z) readr::parse_number(replace(z, grepl("!", z), NA)))
### ... as above
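Since the stated goal in the question is the difference between the significant values, the steps above could be collected into one small helper. This is only a sketch under assumptions: the helper name mask_flagged and the diff column are made up for illustration, and it treats both the literal string "NA" and anything containing "!" as missing.

```r
# Hypothetical helper (name is an assumption): turn flagged or "NA" strings
# into real NA and convert what remains to numeric
mask_flagged <- function(z) {
  z <- as.character(z)
  flagged <- grepl("!", z, fixed = TRUE) | z %in% "NA"
  z[flagged] <- NA
  as.numeric(z)
}

# Data from the question
DF1 <- data.frame(ID   = c("A", "B", "C", "D", "E"),
                  Var1 = c("1", "20", "!20!", "NA", "!10!"),
                  Var2 = c("!5!", "NA", "10", "NA", "NA"),
                  Var3 = c("NA", "NA", "NA", "30", "NA"),
                  Var4 = c("10", "NA", "NA", "NA", "NA"),
                  Var5 = c("NA", "!50!", "20", "NA", "NA"),
                  stringsAsFactors = FALSE)

# Mask every value column, then take row-wise min and max
nums <- lapply(DF1[-1], mask_flagged)
DF1$min  <- do.call(pmin, c(nums, list(na.rm = TRUE)))
DF1$max  <- do.call(pmax, c(nums, list(na.rm = TRUE)))

# Difference between the largest and smallest significant value per row
DF1$diff <- DF1$max - DF1$min
DF1
```

Rows with a single significant value get a difference of 0, and rows with none (row E here) stay NA.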
Regarding "r - How to prevent pmax/pmin from considering non-numeric values?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/76115892/