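r - How best to divide different levels of a factor by each other in an R data frame?

I am trying to divide the values for two different levels of a factor in R, but I cannot work out the best way to do it. I have the following example data frame: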

structure(list(road_type_MEC = structure(c(1L, 1L, 1L, 2L, 2L, 
2L, 3L, 3L, 3L), .Label = c("A Roads", "Motorways", "Other Roads"
), class = "factor"), section = structure(c(2L, 3L, 5L, 2L, 3L, 
5L, 2L, 3L, 5L), .Label = c("CO2_forecasts", "NOX_forecasts", 
"PM10_forecasts", "speedvehkm", "traffic_forecasts"), class = "factor"), 
    Total = c(126976204275.87, 4488849757.15535, 28318632014.3604, 
    75124228527.6742, 2799787906.95208, 43699868192.4562, 96766663214.7388, 
    3181356853.12977, 2094202918.63916)), row.names = c(NA, -9L
), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), na.action = structure(c(`8933` = 8933L, 
`8934` = 8934L, `8935` = 8935L, `8957` = 8957L, `8958` = 8958L, 
`8959` = 8959L, `8981` = 8981L, `8982` = 8982L, `8983` = 8983L, 
`8999` = 8999L, `9000` = 9000L, `9001` = 9001L, `9005` = 9005L, 
`9006` = 9006L, `9007` = 9007L, `9023` = 9023L, `9024` = 9024L, 
`9025` = 9025L, `9029` = 9029L, `9030` = 9030L, `9031` = 9031L, 
`9191` = 9191L, `9192` = 9192L, `9193` = 9193L, `9524` = 9524L, 
`9525` = 9525L, `9526` = 9526L, `9527` = 9527L, `9528` = 9528L, 
`9529` = 9529L, `9530` = 9530L, `9531` = 9531L, `9532` = 9532L, 
`9533` = 9533L, `9534` = 9534L, `9535` = 9535L, `9536` = 9536L, 
`9537` = 9537L, `9538` = 9538L, `9539` = 9539L, `9540` = 9540L, 
`9541` = 9541L, `9548` = 9548L, `9549` = 9549L, `9550` = 9550L, 
`9551` = 9551L, `9552` = 9552L, `9553` = 9553L, `9554` = 9554L, 
`9555` = 9555L, `9556` = 9556L, `9557` = 9557L, `9558` = 9558L, 
`9559` = 9559L, `9560` = 9560L, `9561` = 9561L, `9562` = 9562L, 
`9563` = 9563L, `9564` = 9564L, `9565` = 9565L, `9569` = 9569L, 
`9570` = 9570L, `9571` = 9571L, `9572` = 9572L, `9573` = 9573L, 
`9574` = 9574L, `9575` = 9575L, `9576` = 9576L, `9577` = 9577L, 
`9578` = 9578L, `9579` = 9579L, `9580` = 9580L, `9581` = 9581L, 
`9582` = 9582L, `9583` = 9583L, `9584` = 9584L, `9585` = 9585L, 
`9586` = 9586L, `9587` = 9587L, `9588` = 9588L, `9589` = 9589L, 
`9593` = 9593L, `9594` = 9594L, `9595` = 9595L, `9596` = 9596L, 
`9597` = 9597L, `9598` = 9598L, `9599` = 9599L, `9600` = 9600L, 
`9601` = 9601L, `9602` = 9602L, `9603` = 9603L, `9604` = 9604L, 
`9605` = 9605L, `9606` = 9606L, `9607` = 9607L, `9608` = 9608L, 
`9609` = 9609L, `9610` = 9610L, `9611` = 9611L, `9612` = 9612L, 
`9613` = 9613L, `9617` = 9617L, `9618` = 9618L, `9619` = 9619L, 
`9620` = 9620L, `9621` = 9621L, `9622` = 9622L, `9623` = 9623L, 
`9624` = 9624L, `9625` = 9625L, `9626` = 9626L, `9627` = 9627L, 
`9628` = 9628L, `9629` = 9629L, `9630` = 9630L, `9631` = 9631L, 
`9632` = 9632L, `9633` = 9633L, `9634` = 9634L, `9635` = 9635L, 
`9636` = 9636L, `9637` = 9637L, `9653` = 9653L, `9654` = 9654L, 
`9655` = 9655L, `9701` = 9701L, `9702` = 9702L, `9703` = 9703L, 
`9725` = 9725L, `9726` = 9726L, `9727` = 9727L, `9749` = 9749L, 
`9750` = 9750L, `9751` = 9751L), class = "omit"), groups = structure(list(
    road_type_MEC = structure(1:3, .Label = c("A Roads", "Motorways", 
    "Other Roads"), class = "factor"), .rows = list(1:3, 4:6, 
        7:9)), row.names = c(NA, -3L), class = c("tbl_df", "tbl", 
   "data.frame"), .drop = TRUE))

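I am trying to calculate section$NOX_forecasts/section$traffic_forecasts for each road_type_MEC ("A Roads", "Motorways", "Other Roads") in the data frame, that is, the Total for NOX_forecasts divided by the Total for traffic_forecasts within each road type. I am not sure of the best way to proceed; any help would be much appreciated.

Best answer

Consider reshaping the data from long to wide format, then computing the division across columns: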

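# `section` is the data frame from the question (assumed here to be a plain data.frame)
# Long to wide: one row per road type, one Total.<section> column per level of `section`
rdf <- reshape(section, timevar = "section", v.names = "Total", 
               idvar = "road_type_MEC", direction = "wide")
# Strip the literal "Total." prefix (the dot is escaped so it is not a regex wildcard)
colnames(rdf) <- sub("^Total\\.", "", colnames(rdf))

rdf$div_result <- rdf$NOX_forecasts/rdf$traffic_forecasts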

rdf
#   road_type_MEC NOX_forecasts PM10_forecasts traffic_forecasts div_result
# 1       A Roads  126976204276     4488849757       28318632014   4.483840
# 4     Motorways   75124228528     2799787907       43699868192   1.719095
# 7   Other Roads   96766663215     3181356853        2094202919  46.206918
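
As an alternative that avoids reshaping, the same ratio can be computed directly on the long data by subsetting the two levels and aligning them by road type. This is only a minimal base R sketch: the variable names `emissions`, `nox`, `traffic` and `ratio` are hypothetical, and the data frame is rebuilt by hand as a plain data.frame with the values from the question rather than taken from the original grouped tibble.

```r
# Plain data frame holding the same values as the question's example
# (the name `emissions` is an assumption made for this sketch).
emissions <- data.frame(
  road_type_MEC = rep(c("A Roads", "Motorways", "Other Roads"), each = 3),
  section = rep(c("NOX_forecasts", "PM10_forecasts", "traffic_forecasts"),
                times = 3),
  Total = c(126976204275.87, 4488849757.15535, 28318632014.3604,
            75124228527.6742, 2799787906.95208, 43699868192.4562,
            96766663214.7388, 3181356853.12977, 2094202918.63916),
  stringsAsFactors = FALSE
)

# Keep only the rows for the numerator and denominator levels
nox     <- emissions[emissions$section == "NOX_forecasts", ]
traffic <- emissions[emissions$section == "traffic_forecasts", ]

# Align the two subsets by road type instead of relying on row order
idx <- match(nox$road_type_MEC, traffic$road_type_MEC)

ratio <- data.frame(
  road_type_MEC = nox$road_type_MEC,
  div_result    = nox$Total / traffic$Total[idx]
)
ratio
```

The `div_result` values should agree with the column of the same name in the wide table above. Using `match()` keeps the division correct even if the rows for the two levels are not stored in the same order.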

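If you need the data back in long format, use reshape again:

# Wide to long: every column except road_type_MEC becomes a level of `section`
long_df <- reshape(rdf, varying = names(rdf)[-1], times = names(rdf)[-1], 
                   timevar = "section", v.names = "Total",
                   new.row.names = 1:1E4, direction = "long")

# Sort by road type and reset the row names
long_df <- data.frame(long_df[order(long_df$road_type_MEC),],
                      row.names = NULL)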
long_df

#    road_type_MEC           section        Total id
# 1        A Roads     NOX_forecasts 1.269762e+11  1
# 2        A Roads    PM10_forecasts 4.488850e+09  1
# 3        A Roads traffic_forecasts 2.831863e+10  1
# 4        A Roads        div_result 4.483840e+00  1
# 5      Motorways     NOX_forecasts 7.512423e+10  2
# 6      Motorways    PM10_forecasts 2.799788e+09  2
# 7      Motorways traffic_forecasts 4.369987e+10  2
# 8      Motorways        div_result 1.719095e+00  2
# 9    Other Roads     NOX_forecasts 9.676666e+10  3
# 10   Other Roads    PM10_forecasts 3.181357e+09  3
# 11   Other Roads traffic_forecasts 2.094203e+09  3
# 12   Other Roads        div_result 4.620692e+01  3
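
If the long layout is what you need in the end, the wide round trip can also be skipped by appending one ratio row per road type. The following is a hedged base R sketch, not the answer's method: `emissions` and `with_ratio` are hypothetical names, and the data is again a hand-built plain data.frame with the question's values.

```r
# Plain data frame holding the same values as the question's example
# (the name `emissions` is an assumption made for this sketch).
emissions <- data.frame(
  road_type_MEC = rep(c("A Roads", "Motorways", "Other Roads"), each = 3),
  section = rep(c("NOX_forecasts", "PM10_forecasts", "traffic_forecasts"),
                times = 3),
  Total = c(126976204275.87, 4488849757.15535, 28318632014.3604,
            75124228527.6742, 2799787906.95208, 43699868192.4562,
            96766663214.7388, 3181356853.12977, 2094202918.63916),
  stringsAsFactors = FALSE
)

# Split by road type, append a "div_result" row to each piece, then recombine
pieces <- lapply(split(emissions, emissions$road_type_MEC), function(g) {
  value <- g$Total[g$section == "NOX_forecasts"] /
    g$Total[g$section == "traffic_forecasts"]
  extra <- data.frame(road_type_MEC = g$road_type_MEC[1],
                      section = "div_result",
                      Total = value,
                      stringsAsFactors = FALSE)
  rbind(g, extra)
})

with_ratio <- do.call(rbind, pieces)
rownames(with_ratio) <- NULL  # drop the group-prefixed row names
with_ratio
```

Unlike the reshaped result above, this version has no `id` column, because nothing is converted to wide format along the way.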

Regarding "r - How best to divide different levels of a factor by each other in an R data frame?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/57528612/
