r - tidyr: separate a column into a variable number of columns

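My data frame has a variable that records how long each respondent took to answer the different questions in a questionnaire. The data is structured like this: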

data <- data.frame(variables = c("q1:2,q2:3,q3:4,q4:10,q5:1",
                                 "q2:3,q1:2,q3:2,q5:2,q4:9",
                                 "q1:1,q2:4,q5:8"))
# Attempted: data %>% separate(variables, sep = ",", into = ??)

q1:2 means that the respondent took 2 seconds to answer question 1 (q1).

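Now I want to split this column with separate(), using "," as the delimiter. But I don't know what the "into" argument should be, because not all respondents answered the same number of questions.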

The goal is a data frame like this (the durations do not matter, only the position of each question within each respondent's questionnaire):

pos_q1 pos_q2 pos_q3 pos_q4 pos_q5
----------------------------------
     1      2      3      4      5
     2      1      3      5      4
     1      2     NA     NA      3

Can anyone help? Thanks!

Best answer

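You can first use separate_rows to bring the data into long format, then use separate to split each entry into a question and a time. Adding a row-number column per respondent beforehand lets you number the questions within each respondent and then pivot the data back to wide format.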

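library(dplyr)
library(tidyr)

data %>%
  # Tag each respondent so the entries can be regrouped after reshaping
  mutate(id = row_number()) %>%
  # One row per "question:time" entry
  separate_rows(variables, sep = ',') %>%
  separate(variables, c('question', 'time'), sep = ':') %>%
  group_by(id) %>%
  # Overwrite time with the question's position within the respondent
  mutate(time = row_number()) %>%
  ungroup() %>%
  pivot_wider(names_from = question, values_from = time, names_prefix = 'pos_') %>%
  select(-id)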

# A tibble: 3 x 5
#  pos_q1 pos_q2 pos_q3 pos_q4 pos_q5
#   <int>  <int>  <int>  <int>  <int>
#1      1      2      3      4      5
#2      2      1      3      5      4
#3      1      2     NA     NA      3
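
For comparison, here is a minimal base R sketch of the same idea. It is not part of the original answer. It assumes that every entry has the form "question:time" and that the columns should follow the sorted question labels (which would put q10 before q2 if there were ten or more questions). The names entries, questions, all_q, pos and result are made up for illustration.

```r
data <- data.frame(variables = c("q1:2,q2:3,q3:4,q4:10,q5:1",
                                 "q2:3,q1:2,q3:2,q5:2,q4:9",
                                 "q1:1,q2:4,q5:8"))

# Split each response into its "question:time" entries
# (as.character guards against factor columns in R < 4.0)
entries <- strsplit(as.character(data$variables), ",", fixed = TRUE)

# Keep only the question label in front of the colon
questions <- lapply(entries, function(x) sub(":.*$", "", x))

# Every question label that occurs anywhere, in sorted order
all_q <- sort(unique(unlist(questions)))

# For each respondent, look up where each question appears (NA if unanswered)
pos <- t(sapply(questions, function(q) match(all_q, q)))
colnames(pos) <- paste0("pos_", all_q)

result <- as.data.frame(pos)
print(result)
```

Here match(all_q, q) returns, for each known question, its index in that respondent's answer order, so unanswered questions come out as NA without any extra handling. The printed result should have the same layout as the target table in the question.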

A similar question about "r - tidyr: separate a column into a variable number of columns" can be found on Stack Overflow: https://stackoverflow.com/questions/63969719/
