r - Median of all numeric columns and mode of all character columns in R

Using a dataset like original:

id <- c("JF", "GH", "GH", "ANN", "GH", "ROG", "JF")
group <- c("most", "least", "most", "least", "least", "most", "least")
NP <- c(4,6,18,1,3,12,8)
iso_USA <- c(1, 0, 0, 0, 0, 1, 1)
iso_CHN <- c(0, 1, 1, 0, 0, 0, 0)
color <- c("blue", "orange", "blue", "blue", "red", "orange", "black")

original <- data.frame(id, group, NP, iso_USA, iso_CHN, color)


numeric <- unlist(lapply(original, is.numeric))  
numeric <- names(original[ , numeric])

char <- unlist(lapply(original, is.character))  
char <- names(original[ , char])
char <- char[-1]   #remove id from variables of interest

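I want to group by "group" and compute the median of the numeric variables and the mode of the character variables, so that the result looks like original2. Note that my real dataset has many more columns than the mock version shown here: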

group <- c("least", "most")
NP <- c(4.5, 12)   # median of 6, 1, 3, 8 is 4.5
iso_USA <- c(0,1)
iso_CHN <- c(0, 0)
color <- c("orange", "blue")

original2 <- data.frame(group, NP, iso_USA, iso_CHN, color)

Any clues?

Best answer

Using dplyr's across() function and the Mode function from the accepted answer at the FAQ about implementing a mode function:

Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

library(dplyr)
original %>%
  select(-id) %>%
  group_by(group) %>%
  summarize(
    across(where(is.numeric), median),
    across(where(is.character), Mode)
  )
# # A tibble: 2 × 6
#   group    NP iso_USA iso_CHN color 
#   <chr> <dbl>   <dbl>   <dbl> <chr> 
# 1 least   4.5       0       0 orange
# 2 most   12         1       0 blue
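
In the "least" group every color appears exactly once, so the reported mode depends on how Mode breaks ties. The sketch below repeats the Mode definition from the answer and runs it on the two groups' color values by hand; the vector names least_colors and most_colors are only illustrative and are not part of the original answer.

```r
# Same Mode definition as in the answer above.
Mode <- function(x) {
  ux <- unique(x)                          # distinct values, in order of first appearance
  ux[which.max(tabulate(match(x, ux)))]    # which.max returns the first maximum
}

# Colors of the rows in each group, in their original row order.
least_colors <- c("orange", "blue", "red", "black")  # all tied at one occurrence
most_colors  <- c("blue", "blue", "orange")

Mode(least_colors)  # "orange": ties resolve to the first value seen
Mode(most_colors)   # "blue"
```

Because which.max picks the first maximum, a tie is resolved by row order within the group, so reordering the rows can change the result for tied groups.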

The original question on Stack Overflow: https://stackoverflow.com/questions/70189952/
