r - How to pivot a data set longer when the columns have no common type

Tags: r dataframe dplyr pivot tidyr

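How can I make the first data set look like the second one?

I am trying to use the pivot_longer function (from tidyr, loaded with the tidyverse), but without any luck. Below is a sample data set that reflects a larger data set with more than 50 columns.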

Reshaping data.frame from wide to long format


library(tidyverse)

df1
#> # A tibble: 9 x 6
#>   column_label  val1  val2  cat1  cat2 cat3             
#>          <dbl> <dbl> <dbl> <dbl> <dbl> <chr>            
#> 1            2 0.989  9.89     0    NA <NA>             
#> 2            2 0.622  6.22     1    NA <NA>             
#> 3            3 0.619  6.19    NA     0 <NA>             
#> 4            3 0.119  1.19    NA     1 <NA>             
#> 5           10 0.407  4.07    NA    NA BABY BOOMERS     
#> 6           10 0.800  8.00    NA    NA GEN Z            
#> 7           10 0.305  3.05    NA    NA GENERATION X     
#> 8           10 0.158  1.58    NA    NA MILLENNIALS      
#> 9           10 0.439  4.39    NA    NA SILENT GENERATION

# how do you pivot_longer to create this data set?

df2
#> # A tibble: 9 x 5
#>   column_label  val1  val2 variables Values           
#>          <dbl> <dbl> <dbl> <chr>     <chr>            
#> 1            2 0.989  9.89 cat1      0                
#> 2            2 0.622  6.22 cat1      1                
#> 3            3 0.619  6.19 cat2      0                
#> 4            3 0.119  1.19 cat2      1                
#> 5           10 0.407  4.07 cat3      BABY BOOMERS     
#> 6           10 0.800  8.00 cat3      GEN Z            
#> 7           10 0.305  3.05 cat3      GENERATION X     
#> 8           10 0.158  1.58 cat3      MILLENNIALS      
#> 9           10 0.439  4.39 cat3      SILENT GENERATION

Data

df1 <- structure(list(
  column_label = c(2, 2, 3, 3, 10, 10, 10, 10, 10),
  val1 = c(0.989049526, 0.622384581, 0.618576065, 0.11864823,
           0.406763475, 0.799564365, 0.3053153, 0.158456912, 0.438528606),
  val2 = c(9.890495264, 6.223845807, 6.185760647, 1.186482297,
           4.067634747, 7.995643651, 3.053153001, 1.584569123, 4.385286057),
  cat1 = c(0, 1, NA, NA, NA, NA, NA, NA, NA),
  cat2 = c(NA, NA, 0, 1, NA, NA, NA, NA, NA),
  cat3 = c(NA, NA, NA, NA, "BABY BOOMERS", "GEN Z", "GENERATION X",
           "MILLENNIALS", "SILENT GENERATION")
), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"),
row.names = c(NA, -9L),
spec = structure(list(
  cols = list(
    column_label = structure(list(), class = c("collector_double", "collector")),
    val1 = structure(list(), class = c("collector_double", "collector")),
    val2 = structure(list(), class = c("collector_double", "collector")),
    cat1 = structure(list(), class = c("collector_double", "collector")),
    cat2 = structure(list(), class = c("collector_double", "collector")),
    cat3 = structure(list(), class = c("collector_character", "collector"))
  ),
  default = structure(list(), class = c("collector_guess", "collector")),
  skip = 1
), class = "col_spec"))

df2 <- structure(list(
  column_label = c(2, 2, 3, 3, 10, 10, 10, 10, 10),
  val1 = c(0.989049526, 0.622384581, 0.618576065, 0.11864823,
           0.406763475, 0.799564365, 0.3053153, 0.158456912, 0.438528606),
  val2 = c(9.890495264, 6.223845807, 6.185760647, 1.186482297,
           4.067634747, 7.995643651, 3.053153001, 1.584569123, 4.385286057),
  variables = c("cat1", "cat1", "cat2", "cat2", "cat3", "cat3",
                "cat3", "cat3", "cat3"),
  Values = c("0", "1", "0", "1", "BABY BOOMERS", "GEN Z", "GENERATION X",
             "MILLENNIALS", "SILENT GENERATION")
), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"),
row.names = c(NA, -9L),
spec = structure(list(
  cols = list(
    column_label = structure(list(), class = c("collector_double", "collector")),
    val1 = structure(list(), class = c("collector_double", "collector")),
    val2 = structure(list(), class = c("collector_double", "collector")),
    variables = structure(list(), class = c("collector_character", "collector")),
    Values = structure(list(), class = c("collector_character", "collector"))
  ),
  default = structure(list(), class = c("collector_guess", "collector")),
  skip = 1
), class = "col_spec"))

pivot_longer with multiple classes causes error ("No common type")

Created on 2020-03-13 by the reprex package (v0.3.0)

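Best answer

The columns you select in pivot_longer have no common type: cat1 and cat2 are numeric, while cat3 is character. You can either convert them all to character beforehand, or use the values_ptypes argument to specify the type of the value column.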

df1 %>%
  pivot_longer(cat1:cat3,
               names_to = 'variables', values_to = 'Values',
               values_drop_na = TRUE,
               values_ptypes = list(Values = character()))

# # A tibble: 9 x 5
#   column_label  val1  val2 variables Values           
#          <dbl> <dbl> <dbl> <chr>     <chr>            
# 1            2 0.989  9.89 cat1      0                
# 2            2 0.622  6.22 cat1      1                
# 3            3 0.619  6.19 cat2      0                
# 4            3 0.119  1.19 cat2      1                
# 5           10 0.407  4.07 cat3      BABY BOOMERS     
# 6           10 0.800  8.00 cat3      GEN Z            
# 7           10 0.305  3.05 cat3      GENERATION X     
# 8           10 0.158  1.58 cat3      MILLENNIALS      
# 9           10 0.439  4.39 cat3      SILENT GENERATION
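
The other route mentioned above, converting the columns to character before pivoting, is not shown in the answer. A minimal sketch of it follows, using a small stand-in tibble rather than the full df1 (the stand-in data and the names long_a and long_b are only for illustration). It also shows values_transform, which newer tidyr releases (1.1.0 onward, as far as I know) provide for this conversion; in those releases values_ptypes may raise an error instead of casting, so treat that as something to check against your installed version.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for df1: two numeric cat columns, one character
df_small <- tibble(
  column_label = c(2, 3, 10),
  val1 = c(0.989, 0.619, 0.407),
  cat1 = c(0, NA, NA),
  cat2 = c(NA, 1, NA),
  cat3 = c(NA, NA, "GEN Z")
)

# Option 1: convert the cat columns to character first, then pivot
# (across() assumes dplyr >= 1.0.0)
long_a <- df_small %>%
  mutate(across(cat1:cat3, as.character)) %>%
  pivot_longer(cat1:cat3,
               names_to = "variables", values_to = "Values",
               values_drop_na = TRUE)

# Option 2: let pivot_longer convert the values while pivoting
# (values_transform assumes tidyr >= 1.1.0)
long_b <- df_small %>%
  pivot_longer(cat1:cat3,
               names_to = "variables", values_to = "Values",
               values_drop_na = TRUE,
               values_transform = list(Values = as.character))

long_a
```

Both options should give a character Values column with one row per non-missing cat value. Option 1 works regardless of the tidyr version, which may matter when the real data set has 50+ columns of mixed types.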

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/60676491/
