r - Indexing a data.frame in R using a vector

I have a data.frame containing an ID number and scaled responses from a survey:

df(responses)

ID    X1    X2    X3    X4
A1    1     1     2     1
B2    0     1     3     0
C3    3     3     2     0

I also have a data.frame that serves as a key:

df(key)

X    Y    Z
2    1    1
3    2    2
4    3    4

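I am trying to write a script that computes X, Y, and Z scores for each participant, where the X score is the sum of the responses to the questions listed under X in the key.

For example, participant A1's X score would be the sum of X2, X3, and X4 in row A1 (1+2+1 = 4).

The desired output is: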

df(output)

ID    X    Y    Z
A1    4    4    3
B2    4    4    1
C3    5    8    6

However, I am struggling to index the responses data.frame using the values in key. This is what I have so far:

#store scale names
scales <- c(colnames(key))
#loop over every participant
for (i in responses$ID){
    #create temporary data.frame with only participant "i"s responses
    data <- subset(responses, ID == i)
    #loop over each scale and store the relevant response numbers
    for (s in scales){
        relevantResponses <- scales[c(s)]
        #create a temporary storage for the total of each scale
        runningScore <- 0
        #index each response and add it to the total
        for (r in relevantResponses){
             runningScore <- runningScore + data[1,r]
        }
    }
}

But I get the error:

Error in `[.data.frame`(data, 1, r) : 
  undefined columns selected

Is there a better way to do this indexing than nested loops?

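Best answer

We can loop over the columns of key with lapply, use each column as a positional index to extract the matching numeric columns of 'responses', take the rowSums, convert the resulting list to a data.frame, and cbind it with the first column of 'responses'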

cbind(responses[1], data.frame(lapply(key, 
     function(x) rowSums(responses[-1][, na.omit(x)], na.rm = TRUE))))

-output

#  ID X Y Z
#1 A1 4 4 3
#2 B2 4 4 1
#3 C3 5 8 6
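
To see what the one-liner is doing, it may help to unpack it for a single scale first. The following is only a sketch in base R: it rebuilds the example data with data.frame(), and the names items, x_items, x_score, scores and result are made up for illustration. It also assumes every column of key is complete, so the na.omit() and na.rm = TRUE from the answer are left out.

```r
# Example data, equivalent to the "Data" section of this answer
responses <- data.frame(
  ID = c("A1", "B2", "C3"),
  X1 = c(1L, 0L, 3L), X2 = c(1L, 1L, 3L),
  X3 = c(2L, 3L, 2L), X4 = c(1L, 0L, 0L),
  stringsAsFactors = FALSE
)
key <- data.frame(X = 2:4, Y = 1:3, Z = c(1L, 2L, 4L))

# Drop the ID column so that position k corresponds to question Xk
items <- responses[-1]

# The X scale uses questions 2, 3 and 4, i.e. columns X2, X3 and X4
x_items <- items[, key$X]
x_score <- rowSums(x_items)

# Do the same for every column of key, then put the IDs back in front
scores <- sapply(key, function(idx) rowSums(items[, idx]))
result <- cbind(responses[1], scores)
result
```

If a scale had only one question, `items[, idx]` would return a plain vector and rowSums would fail; `items[, idx, drop = FALSE]` avoids that.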

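Or using tidyverse

library(dplyr)
library(purrr)

# note: cur_data() is deprecated as of dplyr 1.1.0 in favour of pick()
imap(key, ~ responses %>%
     transmute(ID, !!.y :=  rowSums(select(cur_data()[-1], na.omit(.x)),
          na.rm = TRUE))) %>% 
     reduce(inner_join)

-output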

#  ID X Y Z
#1 A1 4 4 3
#2 B2 4 4 1
#3 C3 5 8 6

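Another option is mutate with across

library(dplyr)

# note: mutating key directly relies on key and responses having
# the same number of rows, as they do in this example
key %>%
   mutate(across(everything(), 
       ~ rowSums(responses[-1][na.omit(.)], na.rm = TRUE)), 
          ID = responses$ID, .before = 1)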
#  ID X Y Z
#1 A1 4 4 3
#2 B2 4 4 1
#3 C3 5 8 6
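
As for the error in the question: `scales` is an unnamed character vector, so `scales[c(s)]` looks up an element by a name that does not exist and returns NA, and `data[1, NA]` then fails with "undefined columns selected". If you would rather keep a loop, a minimal sketch could look like the following. It reuses the question's variable names where possible; bad_lookup and output are names introduced here for illustration, and the data is rebuilt with data.frame() rather than structure().

```r
# Example data, equivalent to the "Data" section of this answer
responses <- data.frame(
  ID = c("A1", "B2", "C3"),
  X1 = c(1L, 0L, 3L), X2 = c(1L, 1L, 3L),
  X3 = c(2L, 3L, 2L), X4 = c(1L, 0L, 0L),
  stringsAsFactors = FALSE
)
key <- data.frame(X = 2:4, Y = 1:3, Z = c(1L, 2L, 4L))

scales <- colnames(key)

# What the question's loop effectively did: scales has no names,
# so indexing it by "X" gives NA instead of the question numbers
bad_lookup <- scales["X"]

# Look the question numbers up in key instead, one scale at a time
output <- data.frame(ID = responses$ID, stringsAsFactors = FALSE)
for (s in scales) {
  relevantResponses <- key[[s]]
  # responses[-1] drops ID so that position k is question Xk
  output[[s]] <- rowSums(responses[-1][, relevantResponses])
}
output
```

Because rowSums works on all participants at once, the outer loop over IDs and the inner loop over individual responses are no longer needed.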

Data

responses <- structure(list(ID = c("A1", "B2", "C3"), X1 = c(1L, 0L, 3L), 
    X2 = c(1L, 1L, 3L), X3 = c(2L, 3L, 2L), X4 = c(1L, 0L, 0L
    )), class = "data.frame", row.names = c(NA, -3L))

key <- structure(list(X = 2:4, Y = 1:3, Z = c(1L, 2L, 4L)),
    class = "data.frame", row.names = c(NA, -3L))

Regarding r - Indexing a data.frame in R using a vector, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67755154/
