r - How to generate a new column in an R data frame using sorted items from multiple columns

Tags: r dataframe sorting

I have a data frame in R that looks like this:

df <-
data.frame(
"first_col" = c("apple", "apple", "banana", "banana", "cacao", "dough"),
"second_col" = c("apple", "apple", "banana", "banana", "apple", "dough"),
"third_col" = c("banana", "apple", "banana", "banana", "banana", "apple"),
stringsAsFactors = FALSE
)

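I want to generate a new column, using base R, that contains the contents of the first three columns in sorted order.

If I wanted it unsorted, I could do this: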

df$label <- paste(df$first_col,
                  df$second_col,
                  df$third_col,
                  sep = " - ")

If I try to sort the items like this:

df$label <- paste(sort(df$first_col,
                                     df$second_col,
                                     df$third_col),
                              sep = " - ")

I get this error:

Error in sort(df$first_col, df$second_col, df$third_col) : 
  'decreasing' must be a length-1 logical vector.
Did you intend to set 'partial'?

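Clearly I am doing something wrong. Looking at the documentation, sort() seems to expect a single vector, so I tried combining the columns into one vector: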

df$label <- paste(sort(c(df$first_col,
                                   df$second_col,
                                   df$third_col)),
                              sep = " - ")

But then I get a different error:

Error in `$<-.data.frame`(`*tmp*`, label, value = c("apple",  : 
  replacement has 18 rows, data has 6

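It looks like it is producing three columns' worth of values instead of just one. What am I doing wrong?

Starting from a data frame that looks like this: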

  first_col second_col third_col
1     apple      apple    banana
2     apple      apple     apple
3    banana     banana    banana
4    banana     banana    banana
5     cacao      apple    banana
6     dough      dough     apple

I would like to get something that looks like this:

  first_col second_col third_col                       label
1     apple      apple    banana      apple - apple - banana
2     apple      apple     apple       apple - apple - apple
3    banana     banana    banana    banana - banana - banana
4    banana     banana    banana    banana - banana - banana
5     cacao      apple    banana      apple - banana - cacao
6     dough      dough     apple       apple - dough - dough

You can see the effect of the sorting in rows 5 and 6.

Best answer

Using base R:

df$combined <- apply(df, 1, function(x) paste(sort(x), collapse = "-"))
df
  first_col second_col third_col               combined
1     apple      apple    banana   apple-apple-banana
2     apple      apple     apple    apple-apple-apple
3    banana     banana    banana banana-banana-banana
4    banana     banana    banana banana-banana-banana
5     cacao      apple    banana   apple-banana-cacao
6     dough      dough     apple    apple-dough-dough
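The output above uses "-" as the separator, whereas the question asked for " - ". A minimal sketch of the same idea that produces the requested label follows. The `cols` vector is a name introduced here for illustration; selecting the columns by name is an assumption made so that the line can be re-run after extra columns have been added to `df`.

```r
df <- data.frame(
  first_col  = c("apple", "apple", "banana", "banana", "cacao", "dough"),
  second_col = c("apple", "apple", "banana", "banana", "apple", "dough"),
  third_col  = c("banana", "apple", "banana", "banana", "banana", "apple"),
  stringsAsFactors = FALSE
)

# Columns to combine (hypothetical helper variable, not in the original answer)
cols <- c("first_col", "second_col", "third_col")

# For each row: sort the three values, then join them into one string
df$label <- apply(df[cols], 1, function(x) paste(sort(x), collapse = " - "))

df$label[5]  # "apple - banana - cacao"
df$label[6]  # "apple - dough - dough"
```

Because only `df[cols]` is passed to apply(), a `label` or `combined` column that already exists is not swept into the result.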

Using only columns 1 and 2:

df$combined <- apply(df[1:2], 1, function(x) paste(sort(x), collapse = " - "))
df
  first_col second_col third_col        combined
1     apple      apple    banana   apple - apple
2     apple      apple     apple   apple - apple
3    banana     banana    banana banana - banana
4    banana     banana    banana banana - banana
5     cacao      apple    banana   apple - cacao
6     dough      dough     apple   dough - dough
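apply() converts the data frame to a matrix before looping over its rows. If you would rather name the columns explicitly and skip that conversion, one alternative (a sketch, not part of the original answer) is to walk the columns in parallel with mapply(). The argument names `x`, `y` and `z` are arbitrary.

```r
df <- data.frame(
  first_col  = c("apple", "apple", "banana", "banana", "cacao", "dough"),
  second_col = c("apple", "apple", "banana", "banana", "apple", "dough"),
  third_col  = c("banana", "apple", "banana", "banana", "banana", "apple"),
  stringsAsFactors = FALSE
)

# mapply() calls the function once per row position, passing one value
# from each column; the three values are sorted and joined into one string
df$label <- mapply(
  function(x, y, z) paste(sort(c(x, y, z)), collapse = " - "),
  df$first_col, df$second_col, df$third_col,
  USE.NAMES = FALSE
)

df$label[5]  # "apple - banana - cacao"
```

For a fixed, small set of columns this reads much like the unsorted paste() call in the question; for many columns the apply() version is shorter.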

Data

df <- structure(list(first_col = c("apple", "apple", "banana", "banana", 
"cacao", "dough"), second_col = c("apple", "apple", "banana", 
"banana", "apple", "dough"), third_col = c("banana", "apple", 
"banana", "banana", "banana", "apple")), row.names = c(NA, 
-6L), class = "data.frame")
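As for why the two attempts in the question fail (my reading of the error messages, not something the original answer spells out): sort() takes one vector as its first argument and its second argument is `decreasing`, so `df$second_col` ends up being passed as `decreasing`. Wrapping the columns in c() avoids that error, but it pools every value from all three columns into one long vector, and paste() with `sep` does not join the elements of a single vector; `collapse` does. A small sketch using the values from rows 5 and 6, with variable names made up for the example:

```r
first  <- c("cacao", "dough")
second <- c("apple", "dough")
third  <- c("banana", "apple")

# c() concatenates the columns end to end: 3 columns x 2 rows = 6 values
pooled <- sort(c(first, second, third))
n_pooled <- length(pooled)                      # 6, not 2

# sep only goes between separate arguments; a single vector is left as is
n_pasted <- length(paste(pooled, sep = " - "))  # still 6

# collapse joins the elements of one vector into a single string
row5 <- paste(sort(c(first[1], second[1], third[1])), collapse = " - ")
row5  # "apple - banana - cacao"
```

With the full data the pooled vector has 18 elements, which is where "replacement has 18 rows, data has 6" comes from. Sorting has to happen once per row, which is what the apply() call does.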

Regarding "r - How to generate a new column in an R data frame using sorted items from multiple columns", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61824599/
