> require(data.table)
> have <- data.table(ID = c(1,1,1,2,2)
+ , colA = c("A","B","A","A","A")
+ , colB = c("C","A","B","B","C"))
> have
   ID colA colB
1:  1    A    C
2:  1    B    A
3:  1    A    B
4:  2    A    B
5:  2    A    C
> want <- data.table(ID = c(1,2), UnN = c(3,3))
> want
   ID UnN
1:  1   3
2:  2   3
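I have a data table "have" and I want to count the unique values across multiple columns, "colA" and "colB", by the group "ID". How can I do that?
Not sure why the following doesn't work: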
have[, UnN = uniqueN(c("colA","colB")), by = c("ID")]
Best answer
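Remove the quotes around the column names so that the columns are evaluated as vectors before being passed to uniqueN. Otherwise c("colA","colB") is evaluated as a literal character vector of length 2, so uniqueN returns 2 for every group. The expression also has to be wrapped in .() so that the result column can be named UnN: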
have[, .(UnN = uniqueN(c(colA, colB))), ID]
#    ID UnN
# 1:  1   3
# 2:  2   3
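If the column names are only available as strings (for example, stored in a variable), one possible variant is to pass them through .SDcols and pool the selected columns with unlist() before counting. This is a minimal sketch, not part of the original answer; the variable name cols is hypothetical, and the quoted call is included only to show what the literal character vector produces.

library(data.table)

have <- data.table(ID   = c(1, 1, 1, 2, 2),
                   colA = c("A", "B", "A", "A", "A"),
                   colB = c("C", "A", "B", "B", "C"))

# Quoted names form a plain length-2 character vector,
# so the count is 2 in every group regardless of the data
quoted <- have[, .(UnN = uniqueN(c("colA", "colB"))), by = ID]

# Hypothetical variable holding the column names as strings
cols <- c("colA", "colB")

# .SD contains the .SDcols columns for the current group;
# unlist() pools them into one vector before counting distinct values
res <- have[, .(UnN = uniqueN(unlist(.SD))), by = ID, .SDcols = cols]
res   # UnN is 3 for both IDs

This keeps the column list in one place, which may be convenient when more than two columns are involved.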
Source: a similar question on Stack Overflow about counting unique values in multiple columns by group with R data.table: https://stackoverflow.com/questions/50321148/