r - How to convert monadic data into dyadic data in R (country-year to pair-year)?

I have data organized by country-year, with an ID for each dyadic relationship. I want to reorganize it by dyad-year.

Here is how my data is organized:

     dyadic_id country_codes year
  1          1           200 1990
  2          1            20 1990
  3          1           200 1991
  4          1            20 1991
  5          2           300 1990
  6          2            10 1990
  7          3           100 1990
  8          3            10 1990
  9          4           500 1991
  10         4           200 1991

Here is how I would like my data to be organized:

  dyadic_id_want country_codes_1 country_codes_2 year_want
1              1             200              20      1990
2              1             200              20      1991
3              2             300              10      1990
4              3             100              10      1990
5              4             500             200      1991

Here is reproducible code:

dyadic_id<-c(1,1,1,1,2,2,3,3,4,4)
country_codes<-c(200,20,200,20,300,10,100,10,500,200)
year<-c(1990,1990,1991,1991,1990,1990,1990,1990,1991,1991)
mydf<-as.data.frame(cbind(dyadic_id,country_codes,year))

I would like mydf to look like my_df_i_want:

dyadic_id_want<-c(1,1,2,3,4)
country_codes_1<-c(200,200,300,100,500)
country_codes_2<-c(20,20,10,10,200)
year_want<-c(1990,1991,1990,1990,1991)
my_df_i_want<-as.data.frame(cbind(dyadic_id_want,country_codes_1,country_codes_2,year_want))

Best answer

There are several ways to reshape the data from 'long' to 'wide' format. Two are shown below.

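Using data.table, we convert the data.frame to a data.table by reference (setDT(mydf)) and create a sequence column ('ind') that numbers the rows within each 'dyadic_id' and 'year' group. We then use dcast to reshape the dataset from 'long' to 'wide' format, so each member of the dyad gets its own column.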

library(data.table)
setDT(mydf)[, ind:= 1:.N, by = .(dyadic_id, year)]
dcast(mydf, dyadic_id+year~ paste('country_codes', ind, sep='_'), value.var='country_codes')
#   dyadic_id year country_codes_1 country_codes_2
#1:         1 1990             200              20
#2:         1 1991             200              20
#3:         2 1990             300              10
#4:         3 1990             100              10
#5:         4 1991             500             200
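
The within-group index idea does not depend on data.table. As a rough sketch only (not part of the original answer), the same reshape can be attempted in base R with ave() and reshape(). The data frame is rebuilt here so the block runs on its own, and the name 'wide' is an illustrative choice.

```r
# Rebuild the example data from the question as a plain data.frame.
mydf <- data.frame(
  dyadic_id     = c(1, 1, 1, 1, 2, 2, 3, 3, 4, 4),
  country_codes = c(200, 20, 200, 20, 300, 10, 100, 10, 500, 200),
  year          = c(1990, 1990, 1991, 1991, 1990, 1990, 1990, 1990, 1991, 1991)
)

# Number the rows within each dyadic_id/year group (1, 2, ...).
mydf$ind <- ave(seq_len(nrow(mydf)), mydf$dyadic_id, mydf$year,
                FUN = seq_along)

# Reshape long -> wide: one row per dyad-year, one column per dyad member.
# 'wide' is a hypothetical name used only for this sketch.
wide <- reshape(mydf,
                idvar     = c("dyadic_id", "year"),
                timevar   = "ind",
                direction = "wide",
                sep       = "_")

# Sort and reset row names so the result is easier to compare.
wide <- wide[order(wide$dyadic_id, wide$year), ]
rownames(wide) <- NULL
wide
```

This assumes every dyad-year has exactly two rows, as in the example data. A group with only one row would get NA in country_codes_2.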

Or, using dplyr/tidyr, we do the same thing: group by 'dyadic_id' and 'year', create an 'ind' column with mutate(...), and use spread from tidyr to reshape to 'wide' format.

library(dplyr)
library(tidyr)
mydf %>% 
    group_by(dyadic_id, year) %>%
    mutate(ind= paste0('country_codes', row_number())) %>% 
    spread(ind, country_codes)
#    dyadic_id  year country_codes1 country_codes2
#       (dbl) (dbl)          (dbl)          (dbl)
#1         1  1990            200             20
#2         1  1991            200             20
#3         2  1990            300             10
#4         3  1990            100             10
#5         4  1991            500            200
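
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). A minimal sketch of the same approach with the newer function follows, assuming a recent tidyr is installed. The data frame is rebuilt so the block is self-contained, and 'wide' is an illustrative name not used in the original answer.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question.
mydf <- data.frame(
  dyadic_id     = c(1, 1, 1, 1, 2, 2, 3, 3, 4, 4),
  country_codes = c(200, 20, 200, 20, 300, 10, 100, 10, 500, 200),
  year          = c(1990, 1990, 1991, 1991, 1990, 1990, 1990, 1990, 1991, 1991)
)

wide <- mydf %>%
  group_by(dyadic_id, year) %>%
  mutate(ind = row_number()) %>%   # position of each country within its dyad-year
  ungroup() %>%
  pivot_wider(names_from   = ind,              # one new column per position
              values_from  = country_codes,    # filled with the country codes
              names_prefix = "country_codes_") # country_codes_1, country_codes_2

wide
```

Here names_prefix produces the column names country_codes_1 and country_codes_2 directly, so the paste0() step from the spread() version is not needed.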

A similar question about converting monadic data into dyadic data in R (country-year to pair-year) can be found on Stack Overflow: https://stackoverflow.com/questions/33481645/
