r - How to reference columns that are not part of .SD inside lapply?

Tags: r data.table

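My data.table has a column containing data that I want to use to update a bunch of other columns. That column is a list, and I need to subset each list element using the values in each of the columns that will be included in the .SD expression.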

My data:

dt <- data.table( A = list( c("X","Y") , c("J","K") ) , B = c(1,2) , C = c(2,1) )
#     A B C
#1: X,Y 1 2
#2: J,K 2 1

The result I want:
#     A B C
#1: X,Y X Y
#2: J,K K J

What I have tried:
# Column A is not included in SD so not found...
dt[ , lapply( .SD , function(x) A[x] ) , .SDcols = 2:3 ]
#Error in FUN(X[[1L]], ...) : object 'A' not found


# This also does not work. It sees all of A as one long vector (look at the results for C)
for( i in 2:3 ) dt[ , names(dt)[i] := unlist(A)[ get(names(dt)[i]) ] ]
#     A B C
#1: X,Y X Y
#2: J,K Y X
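
The wrong values in column C come from the flattening itself. A small base R illustration of that, where the variable names A, C, and flat exist only for this sketch:

```r
# Same list and index values as columns A and C in the example data.
A <- list(c("X", "Y"), c("J", "K"))
C <- c(2, 1)

# unlist() drops the per-row structure and leaves one vector of length 4.
flat <- unlist(A)
flat
# [1] "X" "Y" "J" "K"

# Indexing that vector with C looks up positions 2 and 1 of the whole vector,
# so row 2 can never reach "J" or "K".
flat[C]
# [1] "Y" "X"

# What is wanted instead is a lookup inside each row's own list element.
c(A[[1]][C[1]], A[[2]][C[2]])
# [1] "Y" "J"
```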

# I saw this in another answer, but also won't work:
# Basically we add an ID column and use 'by=' to try and solve the problem  above
# Now we get a type mismatch
dt <- data.table( ID = 1:2 , A = list( c("X","Y") , c("J","K") ) , B = c(1,2) , C = c(2,1) , key = "ID" )
for( i in 3:4 ) dt[ , names(dt)[i] := unlist(A)[ get(names(dt)[i]) ] , by = ID ]
#Error in `[.data.table`(dt, , `:=`(names(dt)[i], unlist(A)[get(names(dt)[i])]),  : 
#  Type of RHS ('character') must match LHS ('double'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1)

If anyone is interested, my real data is a set of SNPs and INDELs across different isolates, and this is what I am trying to do:
# My real data looks more like this:
# In columns V10:V15;
# if '.' in first character then use data from 'Ref' column
# else use integer at first character to subset list in 'Alt' column
#   Contig  Pos V3 Ref Alt    Qual        V10       V11       V12       V13       V14       V15
#1:     1   172  .   T   C 81.0000  1/1:.:.:. ./.:.:.:. ./.:.:.:. ./.:.:.:. ./.:.:.:. ./.:.:.:.
#2:     1   399  .   G C,A 51.0000  ./.:.:.:. 1/1:.:.:. 2/2:.:.:. ./.:.:.:. 1/1:.:.:. ./.:.:.:.
#3:     1   516  .   T   G 57.0000  ./.:.:.:. 1/1:.:.:. ./.:.:.:. 1/1:.:.:. ./.:.:.:. ./.:.:.:.
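
One possible way to express the rule in the comments above, shown on a toy copy of the three rows. This is only a sketch under several assumptions: Alt is taken to be a comma-separated character column (not a list), only three of the V columns are reproduced, allele indices are single digits, and the names vcf, alt_list, and call_allele are hypothetical helpers invented for this example.

```r
library(data.table)

# Toy data: values copied from the three rows shown above (V10 to V12 only).
vcf <- data.table(
  Ref = c("T", "G", "T"),
  Alt = c("C", "C,A", "G"),
  V10 = c("1/1:.:.:.", "./.:.:.:.", "./.:.:.:."),
  V11 = c("./.:.:.:.", "1/1:.:.:.", "1/1:.:.:."),
  V12 = c("./.:.:.:.", "2/2:.:.:.", "./.:.:.:.")
)

# Assumption: Alt is a string such as "C,A"; split it into one vector per row.
alt_list <- strsplit(vcf$Alt, ",", fixed = TRUE)

# Hypothetical helper: decide per row between Ref and an element of Alt.
call_allele <- function(gt, ref, alt) {
  first <- substr(gt, 1L, 1L)                 # first character of the genotype
  idx <- suppressWarnings(as.integer(first))  # "." becomes NA
  picked <- mapply(function(a, i) a[i], alt, idx)  # row-wise lookup in Alt
  ifelse(first == ".", ref, picked)           # "." means use the Ref allele
}

# Overwrite each genotype column with the resolved allele.
for (j in c("V10", "V11", "V12")) {
  set(vcf, j = j, value = call_allele(vcf[[j]], vcf$Ref, alt_list))
}

vcf[, .(V10, V11, V12)]
```

With this toy input, row 2 of V12 starts with 2 and so resolves to the second Alt allele, A, while every column that starts with '.' falls back to Ref.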

Best answer

You can use mapply and set with a for loop. There may be more efficient ways to do this.

for(j in c('B','C')){
    set(dt, j = j, value = mapply(FUN = '[', dt[['A']],dt[[j]]))
}
dt
#      A B C
# 1: X,Y X Y
# 2: J,K K J
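
For comparison, here is a sketch of the same idea written with := and .SDcols instead of set(). It assumes the list column is first copied into a local variable (called a here) so that it is visible inside the function passed to lapply; the names a and cols are chosen only for this example.

```r
library(data.table)

dt <- data.table(A = list(c("X", "Y"), c("J", "K")), B = c(1, 2), C = c(2, 1))

cols <- c("B", "C")   # columns to overwrite
a <- dt[["A"]]        # the list column, held outside .SD

# For each column in .SD, pair every list element of `a` with the index
# stored in the same row and extract that single value.
dt[, (cols) := lapply(.SD, function(x) mapply(`[`, a, x)), .SDcols = cols]

dt$B
# [1] "X" "K"
dt$C
# [1] "Y" "J"
```

Because neither i nor by is used, := replaces each column as a whole, so the switch from numeric to character does not raise the type mismatch error shown in the question.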

Regarding "r - How to reference columns that are not part of .SD inside lapply?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24263170/
