Suppose I have the following "key" dataset:
key = read.table(text = "question r_answer d_answer
20 A B
21 B A
22 A B
23 B A
24 A B
25 B A", header = TRUE)
> key
  question r_answer d_answer
1       20        A        B
2       21        B        A
3       22        A        B
4       23        B        A
5       24        A        B
6       25        B        A
This tells me, for a given question, which answer "R" would give and which answer "D" would give.
Now let's say this is the data set:
data = read.table(text = "person_id question answer
1 20 A
1 21 B
1 22 A
1 23 B
1 24 A
1 25 B
2 20 A
2 21 A
2 23 A
2 24 B
2 25 B", header = TRUE)
> data
   person_id question answer
1          1       20      A
2          1       21      B
3          1       22      A
4          1       23      B
5          1       24      A
6          1       25      B
7          2       20      A
8          2       21      A
9          2       23      A
10         2       24      B
11         2       25      B
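This tells me, for a given person, what their actual answers were. I want to create an answer_type column in data that equals r_answer or d_answer, depending on which column of key holds the matching value for that question. The resulting output would be:
   person_id question answer answer_type
1          1       20      A    r_answer
2          1       21      B    r_answer
3          1       22      A    r_answer
4          1       23      B    r_answer
5          1       24      A    r_answer
6          1       25      B    r_answer
7          2       20      A    r_answer
8          2       21      A    d_answer
9          2       23      A    d_answer
10         2       24      B    d_answer
11         2       25      B    r_answer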
I have a feeling the answer will involve a merge or a join from dplyr, but I can't quite work it out.
Best answer
One dplyr and tidyr option could be:
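library(dplyr)
library(tidyr)

# Reshape key to long form (one row per question and answer_type), then match on question and answer
data %>%
  left_join(key %>%
              pivot_longer(-question, names_to = "answer_type"),
            by = c("question" = "question", "answer" = "value"))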
   person_id question answer answer_type
1          1       20      A    r_answer
2          1       21      B    r_answer
3          1       22      A    r_answer
4          1       23      B    r_answer
5          1       24      A    r_answer
6          1       25      B    r_answer
7          2       20      A    r_answer
8          2       21      A    d_answer
9          2       23      A    d_answer
10         2       24      B    d_answer
11         2       25      B    r_answer
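To make the lookup logic explicit, here is a minimal base R sketch of the same idea with no packages. It is an illustration, not part of the original answer. The names key_long and idx are made up for this example, and stringsAsFactors = FALSE is added as an assumption so that the answer columns are plain character vectors on older R versions too.

```r
# Rebuild the two example tables from the question.
key <- read.table(text = "question r_answer d_answer
20 A B
21 B A
22 A B
23 B A
24 A B
25 B A", header = TRUE, stringsAsFactors = FALSE)

data <- read.table(text = "person_id question answer
1 20 A
1 21 B
1 22 A
1 23 B
1 24 A
1 25 B
2 20 A
2 21 A
2 23 A
2 24 B
2 25 B", header = TRUE, stringsAsFactors = FALSE)

# Long form of key: one row per (question, answer value) pair,
# labelled with the column the value came from.
key_long <- data.frame(
  question    = rep(key$question, times = 2),
  value       = c(key$r_answer, key$d_answer),
  answer_type = rep(c("r_answer", "d_answer"), each = nrow(key)),
  stringsAsFactors = FALSE
)

# Find, for each row of data, the row of key_long with the same
# question and the same answer value.
idx <- match(paste(data$question, data$answer),
             paste(key_long$question, key_long$value))

# Rows with no match in key_long get NA, as they would with a left join.
data$answer_type <- key_long$answer_type[idx]

print(data)
```

Because match() returns one position per row of data, the original row order is kept, which a plain merge() would not guarantee.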
This question, "r - Populate a column based on a lookup table in R", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/59903278/