r - pivot_longer with pairs of columns

Tags: r dataframe pivot-table tidyr data-manipulation

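I am once again struggling to convert a wide df into a long one with pivot_longer. The data frame holds the results of a power analysis for different effect sizes and sample sizes. This is what the original df looks like: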

  es_issue_owner es_independence es_party pwr_issue_owner_1200 pwr_independence_1200 pwr_party_1200 pwr_issue_owner_2400 pwr_independence_2400 pwr_party_2400
1            0.1             0.1      0.1                0.087                 0.080          0.081                0.130                 0.163          0.102
2            0.2             0.2      0.2                0.235                 0.273          0.157                0.406                 0.513          0.267

Or with dput:

example <- structure(list(es_issue_owner = c(0.1, 0.2), es_independence = c(0.1, 
0.2), es_party = c(0.1, 0.2), pwr_issue_owner_1200 = c(0.087, 
0.235), pwr_independence_1200 = c(0.08, 0.273), pwr_party_1200 = c(0.081, 
0.157), pwr_issue_owner_2400 = c(0.13, 0.406), pwr_independence_2400 = c(0.163, 
0.513), pwr_party_2400 = c(0.102, 0.267)), row.names = 1:2, class = "data.frame")

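Each effect size (es) for the three measures ("independence", "issue_owner", "party") is paired with the power calculations for sample sizes of 1200 and 2400. Based on the example above, this is the output I would like to get: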

           type  es  pwr value
1  independence 0.1 1200 0.080
2   issue_owner 0.1 1200 0.087
3         party 0.1 1200 0.081
4  independence 0.2 1200 0.273
5   issue_owner 0.2 1200 0.235
6         party 0.2 1200 0.157
7  independence 0.1 2400 0.163
8   issue_owner 0.1 2400 0.130
9         party 0.1 2400 0.102
10 independence 0.2 2400 0.513
11  issue_owner 0.2 2400 0.406
12        party 0.2 2400 0.267

Or, with dput:

output <- structure(list(type = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L), .Label = c("independence", "issueowner", 
"party"), class = "factor"), es = c(0.1, 0.1, 0.1, 0.2, 0.2, 
0.2, 0.1, 0.1, 0.1, 0.2, 0.2, 0.2), pwr = c(1200, 1200, 1200, 
1200, 1200, 1200, 2400, 2400, 2400, 2400, 2400, 2400), value = c("0.080", 
"0.087", "0.081", "0.273", "0.235", "0.157", "0.163", "0.130", 
"0.102", "0.513", "0.406", "0.267")), out.attrs = list(dim = c(type = 3L, 
es = 2L, pwr = 2L, value = 1L), dimnames = list(type = c("type=independence", 
"type=issueowner", "type=party"), es = c("es=0.1", "es=0.2"), 
    pwr = c("pwr=1200", "pwr=2400"), value = "value=NA")), class = "data.frame", row.names = c(NA, 
-12L))

As a start, I tried experimenting with this:

example %>% 
  pivot_longer(cols = everything(),
               names_pattern = "(es_[A-Za-z]+)(pwr_[A-Za-z]+_1200)(pwr_[A-Za-z]+_2400)",
               # names_sep = "(?=\\d)_(?=\\d)",
               names_to = c("es", "pwr_1200", "pwr_2400"),
               values_to = "value")

But it didn't work, so I tried doing it in two steps. That runs, but the "pairing" gets messed up:

  example %>% 
  # pivot_longer(cols = everything(),
  #              names_pattern = "(es_[A-Za-z]+)(pwr_[A-Za-z]+_1200)(pwr_[A-Za-z]+_2400)",
  #              # names_sep = "(?=\\d)_(?=\\d)",
  #              names_to = c("es", "pwr_1200", "pwr_2400"),
  #              values_to = "value")
  pivot_longer(cols = contains("pwr_"),
               # names_pattern = "es_pwr(.*)1200_pwr(.*)2400",
               names_sep = "_(?=\\d)",
               names_to = c("pwr_type", "pwr_sample"), values_to = "value") %>%
  pivot_longer(cols = contains("es_"),
               # names_pattern = "es_pwr(.*)1200_pwr(.*)2400",
               # names_sep = "_(?=\\d)",
               names_to = "es_type", values_to = "es")

Any help would be greatly appreciated!

Best answer

library(tidyverse)

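example %>% 
  # Stack the es_* columns: the measure name goes to `type`, the effect size to `es`
  pivot_longer(cols = starts_with("es"), names_to = "type", names_prefix = "es_", values_to = "es") %>%
  # Stack the pwr_* columns; `pwr` now holds names such as "issue_owner_1200"
  pivot_longer(cols = starts_with("pwr"), names_to = "pwr", names_prefix = "pwr_") %>% 
  # The two pivots produce every es/pwr combination; keep only rows where the measures match
  # (the first three characters are enough to tell the three measures apart)
  filter(substr(type, 1, 3) == substr(pwr, 1, 3)) %>% 
  # Extract the sample size (1200 or 2400) from the former column name
  mutate(pwr = parse_number(pwr)) %>% 
  arrange(pwr, es, type)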

Output

   type            es   pwr value
 1 independence   0.1  1200 0.08 
 2 issue_owner    0.1  1200 0.087
 3 party          0.1  1200 0.081
 4 independence   0.2  1200 0.273
 5 issue_owner    0.2  1200 0.235
 6 party          0.2  1200 0.157
 7 independence   0.1  2400 0.163
 8 issue_owner    0.1  2400 0.13 
 9 party          0.1  2400 0.102
10 independence   0.2  2400 0.513
11 issue_owner    0.2  2400 0.406
12 party          0.2  2400 0.267
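
The answer above pairs the columns by comparing the first three characters of their names, which works here because "ind", "iss" and "par" are all distinct. As a hedged alternative, not part of the original answer, the two groups of columns can be pivoted separately and then joined on an explicit key. The helper names `row`, `es_long` and `pwr_long` below are made up for illustration, and `names_transform` assumes tidyr 1.1.0 or later. With the example data this should give the same 12 rows as the output above.

```r
library(dplyr)
library(tidyr)

# Example data from the question
example <- structure(list(es_issue_owner = c(0.1, 0.2), es_independence = c(0.1, 
0.2), es_party = c(0.1, 0.2), pwr_issue_owner_1200 = c(0.087, 
0.235), pwr_independence_1200 = c(0.08, 0.273), pwr_party_1200 = c(0.081, 
0.157), pwr_issue_owner_2400 = c(0.13, 0.406), pwr_independence_2400 = c(0.163, 
0.513), pwr_party_2400 = c(0.102, 0.267)), row.names = 1:2, class = "data.frame")

# Hypothetical row id so the two long tables can be matched up again
with_id <- example %>% mutate(row = row_number())

# Long table of effect sizes: one row per (row, type)
es_long <- with_id %>%
  select(row, starts_with("es_")) %>%
  pivot_longer(cols = -row, names_to = "type", names_prefix = "es_", values_to = "es")

# Long table of power values: the column name is split into measure and sample size
pwr_long <- with_id %>%
  select(row, starts_with("pwr_")) %>%
  pivot_longer(cols = -row,
               names_to = c("type", "pwr"),
               names_pattern = "pwr_(.*)_(\\d+)",
               names_transform = list(pwr = as.integer),
               values_to = "value")

# Join on the full measure name instead of a three-character prefix
result <- inner_join(es_long, pwr_long, by = c("row", "type")) %>%
  select(type, es, pwr, value) %>%
  arrange(pwr, es, type)

result
```

Joining on the complete `type` string keeps the pairing correct even if two measures were to share their first three characters, at the cost of a few more lines.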

Source: the question "r - pivot_longer with pairs of columns" on Stack Overflow: https://stackoverflow.com/questions/70969176/
