r - Organizing subgroup strings (text)

Tags: r string text aggregate

I am trying to convert a df that looks like this:

df <- data.frame(first = c("a", "a", "b", "b", "b", "c"), 
  words = c("about", "among", "blue", "but", "both", "cat"))

df
  first words
1     a about
2     a among
3     b  blue
4     b   but
5     b  both
6     c   cat

into this format:

df1
  first           words
1     a    about, among
2     b blue, but, both
3     c             cat

I have tried:

aggregate(words ~ first, data = df, FUN = list)

  first   words
1     a    1, 2
2     b 3, 5, 4
3     c       6

tidyverse:

df %>%
  group_by(first) %>% 
  group_rows()

Any suggestions would be greatly appreciated!

Best answer

A data.table solution:

library(data.table)

df <- data.frame(first = c("a", "a", "b", "b", "b", "c"), 
  words = c("about", "among", "blue", "but", "both", "cat"))

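# setDT() converts df to a data.table by reference; toString() then collapses
# each group's values into a single comma-separated string
df <- setDT(df)[, lapply(.SD, toString), by = first]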

df
#    first           words
# 1:     a    about, among
# 2:     b blue, but, both
# 3:     c             cat

# convert back to a data.frame if you want
setDF(df)
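
For comparison, here is a minimal base R sketch of the same idea, which is not part of the original answer. It assumes the only change needed to the `aggregate()` call in the question is swapping `FUN = list` for `FUN = toString`. The name `df1` is taken from the desired output in the question, and `stringsAsFactors = FALSE` is added here as an assumption so that `words` is stored as character on any R version. The integers in the question's output look like factor codes, which would be consistent with `words` being a factor (the default before R 4.0).

```r
# Same example data as in the question; stringsAsFactors = FALSE is an
# assumption added here so that words is character regardless of R version
df <- data.frame(first = c("a", "a", "b", "b", "b", "c"),
                 words = c("about", "among", "blue", "but", "both", "cat"),
                 stringsAsFactors = FALSE)

# toString() pastes the values of each group together, separated by ", ",
# so every group collapses to a single string
df1 <- aggregate(words ~ first, data = df, FUN = toString)

df1
#   first           words
# 1     a    about, among
# 2     b blue, but, both
# 3     c             cat
```

The tidyverse attempt in the question could probably be finished along the same lines, with `summarise(words = toString(words))` after `group_by(first)`.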

Regarding r - Organizing subgroup strings (text), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58460117/
