I have a data frame in R that looks like this:
df <-
  data.frame(
    "first_col" = c("apple", "apple", "banana", "banana", "cacao", "dough"),
    "second_col" = c("apple", "apple", "banana", "banana", "apple", "dough"),
    "third_col" = c("banana", "apple", "banana", "banana", "banana", "apple"),
    stringsAsFactors = FALSE
  )
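I want to generate a new column, using base R, that holds the contents of the first three columns sorted within each row.
If I wanted it unsorted, I could do this: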
df$label <- paste(df$first_col,
                  df$second_col,
                  df$third_col,
                  sep = " - ")
If I try to sort the items like this:
df$label <- paste(sort(df$first_col,
                       df$second_col,
                       df$third_col),
                  sep = " - ")
I get this error:
Error in sort(df$first_col, df$second_col, df$third_col) :
'decreasing' must be a length-1 logical vector.
Did you intend to set 'partial'?
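Clearly I'm doing something wrong. Looking at the documentation, sort seems to require a single vector, so I tried combining the columns into one vector: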
df$label <- paste(sort(c(df$first_col,
                         df$second_col,
                         df$third_col)),
                  sep = " - ")
But then I get another error:
Error in `$<-.data.frame`(`*tmp*`, label, value = c("apple", :
replacement has 18 rows, data has 6
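It looks like it is producing three columns' worth of values instead of a single column. What am I doing wrong?
Starting from a data frame that looks like this: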
  first_col second_col third_col
1     apple      apple    banana
2     apple      apple     apple
3    banana     banana    banana
4    banana     banana    banana
5     cacao      apple    banana
6     dough      dough     apple
I would like to get something like this:
  first_col second_col third_col                    label
1     apple      apple    banana   apple - apple - banana
2     apple      apple     apple    apple - apple - apple
3    banana     banana    banana banana - banana - banana
4    banana     banana    banana banana - banana - banana
5     cacao      apple    banana   apple - banana - cacao
6     dough      dough     apple    apple - dough - dough
You can tell the values are sorted by looking at rows 5 and 6.
Best answer
Using base R:
df$combined <- apply(df, 1, function(x) paste(sort(x), collapse = "-"))
df
  first_col second_col third_col             combined
1     apple      apple    banana   apple-apple-banana
2     apple      apple     apple    apple-apple-apple
3    banana     banana    banana banana-banana-banana
4    banana     banana    banana banana-banana-banana
5     cacao      apple    banana   apple-banana-cacao
6     dough      dough     apple    apple-dough-dough
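As for why the attempts in the question failed: sort() takes a single vector and treats a second positional argument as decreasing, which explains the first error. Wrapping the columns in c() flattens them into one vector of 18 values, and paste() only uses sep between separate arguments, so all 18 values came back and the assignment failed. It is collapse that joins the elements of one vector into a single string. A small sketch on one row (the names row5, with_sep and with_collapse are only for illustration):

```r
# Row 5 of the example data, written out by hand for illustration
row5 <- c("cacao", "apple", "banana")

# sep only separates distinct arguments, so a single vector keeps its length
with_sep <- paste(sort(row5), sep = " - ")

# collapse joins the elements of one vector into a single string
with_collapse <- paste(sort(row5), collapse = " - ")

length(with_sep)
with_collapse
```

apply(df, 1, ...) then simply repeats the sort-and-collapse step once per row.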
Using only columns 1 and 2:
df$combined <- apply(df[1:2], 1, function(x) paste(sort(x), collapse = " - "))
df
  first_col second_col third_col        combined
1     apple      apple    banana   apple - apple
2     apple      apple     apple   apple - apple
3    banana     banana    banana banana - banana
4    banana     banana    banana banana - banana
5     cacao      apple    banana   apple - cacao
6     dough      dough     apple   dough - dough
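Note that apply(df, 1, ...) uses every column that is present, so once a result column has been added, running the call on the whole data frame again would sort that column in as well. A minimal sketch that avoids this by naming the input columns explicitly; the vector name cols is my own, and the " - " separator and the label column name are taken from the desired output in the question:

```r
df <- data.frame(
  first_col  = c("apple", "apple", "banana", "banana", "cacao", "dough"),
  second_col = c("apple", "apple", "banana", "banana", "apple", "dough"),
  third_col  = c("banana", "apple", "banana", "banana", "banana", "apple"),
  stringsAsFactors = FALSE
)

# Illustrative helper vector: the columns to sort within each row
cols <- c("first_col", "second_col", "third_col")

# apply() passes each row as a character vector; sort it, then join with collapse
df$label <- apply(df[cols], 1, function(x) paste(sort(x), collapse = " - "))

df$label
```

Because the input is restricted to cols, the call can be repeated safely after label (or any other column) has been added.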
Data
df <- structure(list(first_col = c("apple", "apple", "banana", "banana",
"cacao", "dough"), second_col = c("apple", "apple", "banana",
"banana", "apple", "dough"), third_col = c("banana", "apple",
"banana", "banana", "banana", "apple")), row.names = c(NA,
-6L), class = "data.frame")
Regarding r - how to generate a new column in an R data frame with ordered items from multiple columns, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61824599/