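I would like to use dplyr to count, row by row, how many times a certain text value (or factor level) occurs across a subset of columns.
Here is the input: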
> input_df
  num_col_1 num_col_2 text_col_1 text_col_2
1         1         4        yes        yes
2         2         5         no        yes
3         3         6         no       <NA>
Here is the desired output:
> output_df
  num_col_1 num_col_2 text_col_1 text_col_2 sum_yes
1         1         4        yes        yes       2
2         2         5         no        yes       1
3         3         6         no       <NA>       0
The sum_yes column holds the number of "yes" values in each row. I tried two approaches:
Attempted solution 1:
text_cols = c("text_col_1","text_col_2")
df = input_df %>% mutate(sum_yes = rowSums( select(text_cols) == "yes" ), na.rm = TRUE)
Error:
Error in mutate_impl(.data, dots) :
Evaluation error: no applicable method for 'select_' applied to an object of class "character".
Attempted solution 2:
text_cols = c("text_col_1","text_col_2")
df = input_df %>% select(text_cols) %>% rowsum("yes", na.rm = TRUE)
Error:
Error in rowsum.data.frame(., "yes", na.rm = TRUE) :
incorrect length for 'group'
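Best answer
Use rowSums inside mutate on the text columns to count the number of "yes" values in each row. Here df is the input data frame (input_df in the question) and text_cols is the character vector defined there. Pass na.rm = TRUE so that a row containing NA gets a count instead of NA.
library(dplyr)
df %>% mutate(sum_yes = rowSums(.[text_cols] == "yes", na.rm = TRUE))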
# num_col_1 num_col_2 text_col_1 text_col_2 sum_yes
#* <int> <int> <fct> <fct> <int>
#1 1 4 yes yes 2
#2 2 5 no yes 1
#3 3 6 no <NA> 0
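For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the text columns are factors and that the missing entry is a real NA rather than the string "<NA>"; the data frame is rebuilt by hand from the question, and output_df is just the name used in the question for the result.

```r
library(dplyr)

# Rebuild the example input (assumption: factor text columns, one real NA).
input_df <- data.frame(
  num_col_1  = 1:3,
  num_col_2  = 4:6,
  text_col_1 = factor(c("yes", "no", "no")),
  text_col_2 = factor(c("yes", "yes", NA))
)
text_cols <- c("text_col_1", "text_col_2")

# Compare the selected columns with "yes" and sum the TRUE values per row.
# na.rm = TRUE makes the NA in row 3 count as "no match".
output_df <- input_df %>%
  mutate(sum_yes = rowSums(.[text_cols] == "yes", na.rm = TRUE))

print(output_df)
```

The comparison yields a logical matrix, so rowSums simply counts the TRUE entries in each row.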
Inspired by this answer.
Using rowwise with c_across (dplyr 1.0.0 or later):
df %>%
  rowwise() %>%
  mutate(sum_yes = sum(c_across(all_of(text_cols)) == "yes", na.rm = TRUE))
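Since rowwise evaluates the expression once per row, it may be slow on large data. A possible vectorized variant, not part of the original answer, is to build the logical columns with across and sum them with rowSums. This is a sketch assuming dplyr 1.0.0 or later; the example data frame is rebuilt by hand and result is a name chosen for illustration.

```r
library(dplyr)

# Example data rebuilt from the question (assumption: real NA, factor columns).
df <- data.frame(
  num_col_1  = 1:3,
  num_col_2  = 4:6,
  text_col_1 = factor(c("yes", "no", "no")),
  text_col_2 = factor(c("yes", "yes", NA))
)
text_cols <- c("text_col_1", "text_col_2")

# across() turns each text column into a logical column (TRUE where "yes");
# rowSums() then counts the TRUE values per row, ignoring NA.
result <- df %>%
  mutate(sum_yes = rowSums(across(all_of(text_cols), ~ .x == "yes"), na.rm = TRUE))

print(result)
```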
Using do with rowwise:
df %>%
  rowwise() %>%
  do((.) %>% as.data.frame %>%
    mutate(sum_yes = sum(. == "yes", na.rm = TRUE)))
Using select with rowSums (note that this returns only the text columns plus sum_yes):
df %>%
  select(all_of(text_cols)) %>%
  mutate(sum_yes = rowSums(. == "yes", na.rm = TRUE))
Or in base R:
df$sum_yes <- rowSums(df[text_cols] == "yes", na.rm = TRUE)
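The effect of na.rm is easy to miss, so here is a small base R sketch that shows it. The helper count_matches is hypothetical (it is not from the original answer), and the data frame is rebuilt by hand assuming a real NA in the third row.

```r
# Hypothetical helper: count how often `value` appears in the given columns
# of each row, treating NA as "no match".
count_matches <- function(data, cols, value) {
  rowSums(data[cols] == value, na.rm = TRUE)
}

df <- data.frame(
  num_col_1  = 1:3,
  num_col_2  = 4:6,
  text_col_1 = factor(c("yes", "no", "no")),
  text_col_2 = factor(c("yes", "yes", NA))
)
text_cols <- c("text_col_1", "text_col_2")

# Without na.rm, the NA in row 3 propagates into the row sum.
with_na <- rowSums(df[text_cols] == "yes")

# With na.rm = TRUE, row 3 gets a count of 0.
without_na <- count_matches(df, text_cols, "yes")
```

If a row that contains NA should be reported as NA rather than 0, the first form is the one to use.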
Regarding "r - counting rows across a subset of columns in dplyr", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51783095/