r - Sum one specific column to n columns in every possible combination of 2 and 3

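I have a data set with 240 columns and 146 rows. I am providing only a small block from the start of the data set (6 rows of the first 8 columns).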

DF <- data.frame(
  D1 = c(-0.253, 0.253, -0.951, 0.951, 0.501, -0.501),
  D2 = c(-0.52, -0.52, 0.52, 0.52, -0.172, -0.172),
  D3 = c(0.014, 0.014, 0.014, 0.014, -0.014, -0.014),
  S3 = c(0.095, 0.095, 0.095, 0.095, 0.095, 0.095),
  D1 = c(-0.966, 0.966, -0.647, 0.647, 0.905, -0.905),
  D2 = c(-0.078, -0.078, 0.078, 0.078, -0.943, -0.943),
  D3 = c(-0.046, -0.046, -0.046, -0.046, 0.046, 0.046),
  S3 = c(0.07, 0.07, 0.07, 0.07, 0.07, 0.07),
  # keep the repeated column names instead of renaming them to D1.1, D2.1, ...
  check.names = FALSE
)

I want to add every 4th column (i.e. S3) to the 3 columns before it, in the following combinations:

D1+S3
D2+S3
D3+S3
D1+D2+S3
D1+D3+S3

In the new data frame, the columns should then be
D1 D2 D3 S3 D1+S3 D2+S3 D3+S3 D1+D2+S3 D1+D3+S3 D1 D2 D3 S3 D1+S3 D2+S3 D3+S3 D1+D2+S3 D1+D3+S3

How can I do this in R? Any help in this regard is highly appreciated.

Best answer

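In the code below, I reshape the data frame so that all values end up in 4 columns. To tell the original column blocks apart, I add an id column. After that, the operation you want becomes easy.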

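library(tidyverse)

df <- read_table(
"D1         D2     D3      S3      D1       D2      D3    S3
-0.253  -0.520  0.014   0.095   -0.966  -0.078  -0.046  0.070
0.253   -0.520  0.014   0.095   0.966   -0.078  -0.046  0.070
-0.951  0.520   0.014   0.095   -0.647  0.078   -0.046  0.070
0.951   0.520   0.014   0.095   0.647   0.078   -0.046  0.070
0.501   -0.172  -0.014  0.095   0.905   -0.943  0.046   0.070
-0.501  -0.172  -0.014  0.095   -0.905  -0.943  0.046   0.070
")

# First column of each 4-column block (D1, D2, D3, S3)
i <- seq(1, ncol(df) - 3, 4)

# Stack the blocks on top of each other under common column names
# and record which block each row came from
df_out <- map_dfr(i, ~select(df, seq(.x, .x + 3)) %>% set_names(c("D1", "D2", "D3", "S3"))) %>%
  mutate(id = rep(seq_along(i), each = nrow(df)))

# Add S3 to the requested combinations of D1, D2 and D3
df_sums <- df_out %>%
  mutate(D1_S3 = D1 + S3,
         D2_S3 = D2 + S3,
         D3_S3 = D3 + S3,
         D1_D2_S3 = D1 + D2 + S3,
         D1_D3_S3 = D1 + D3 + S3)

If you later want to bring it back to the original wide shape, you can do the following.

# One data frame per block (dropping id), then side by side again;
# bind_cols() may rename the repeated column names to keep them unique
df_sums %>%
  group_split(id, .keep = FALSE) %>%
  bind_cols()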
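
The same block-by-block idea can also be sketched in base R, without reshaping to long form first. This is only an illustration under stated assumptions: the toy data frame `toy`, the helper `add_sums()` and the `"D1+S3"`-style column names are made up for this example and are not part of the original answer.

```r
# Toy data: two blocks of (D1, D2, D3, S3) with repeated names, as in the question
toy <- data.frame(
  D1 = c(1, 2), D2 = c(3, 4), D3 = c(5, 6), S3 = c(10, 10),
  D1 = c(7, 8), D2 = c(9, 1), D3 = c(2, 3), S3 = c(20, 20),
  check.names = FALSE
)

block_size <- 4                                  # D1, D2, D3, S3
starts <- seq(1, ncol(toy), by = block_size)     # first column of each block

# Hypothetical helper: return one block followed by its five sums
add_sums <- function(start) {
  b <- setNames(toy[start:(start + block_size - 1)], c("D1", "D2", "D3", "S3"))
  cbind(b,
        "D1+S3"    = b$D1 + b$S3,
        "D2+S3"    = b$D2 + b$S3,
        "D3+S3"    = b$D3 + b$S3,
        "D1+D2+S3" = b$D1 + b$D2 + b$S3,
        "D1+D3+S3" = b$D1 + b$D3 + b$S3)
}

# Process every block and put the results side by side again
wide <- do.call(cbind, lapply(starts, add_sums))
names(wide)
```

Because the column names repeat across blocks, it is safer to address columns of `wide` by position than by name.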

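Edit: I rewrote the code so that it works for a variable number of decompositions. You should only need to change n_decomp <- 3 to the appropriate number. It creates a variable for every possible combination of the decomposition variables plus S3, so the number of new columns grows quickly as the number of decompositions increases (2^n - 1 columns for n decomposition variables).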

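n_decomp <- 3                  # number of decomposition columns per block
n_var <- n_decomp + 1          # decomposition columns plus the column added to them
i <- seq(1, ncol(df), n_var)   # first column of each block
# Names of one block; drop the "...1"-style suffix that newer readr
# versions append to duplicated column names
df_names <- str_remove(names(df)[1:n_var], "\\.{3}\\d+$")
sum_col <- df_names[n_var]     # "S3" in the example

df_out <-
  map_dfr(i,
          ~select(df, seq(.x, .x + n_decomp)) %>%
            set_names(df_names)) %>%
  mutate(id = rep(seq_along(i), each = nrow(df)))

# Every combination of 1 to n_decomp decomposition columns,
# each followed by the column that is added to them
decomp_combn <- map(1:n_decomp,
                    ~combn(df_names[1:n_decomp], .x, simplify = FALSE)) %>%
  unlist(recursive = FALSE) %>%
  map(c, sum_col)

# One new column per combination, named e.g. D1_D2_S3
decomp_combn %>%
  set_names(map_chr(., str_c, collapse = "_")) %>%
  map(~rowSums(select(df_out, all_of(.x)))) %>%
  as_tibble() %>%
  bind_cols(df_out, .)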
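
To get a feel for how quickly the number of new columns grows, and how the combinations could be capped, here is a small base R sketch. The helpers `n_combinations()` and `limited_combn()` and the `max_size` argument are hypothetical names introduced for illustration only; they are not part of the code above.

```r
# Number of sum columns per block when every non-empty subset of the
# n decomposition columns is combined with the total column (equals 2^n - 1)
n_combinations <- function(n) sum(choose(n, seq_len(n)))

sapply(1:6, n_combinations)

# Assumption: only subsets of at most `max_size` decomposition columns are wanted
limited_combn <- function(vars, max_size) {
  unlist(lapply(seq_len(max_size),
                function(k) combn(vars, k, simplify = FALSE)),
         recursive = FALSE)
}

# Subsets of size 1 and 2 of D1, D2, D3
limited_combn(c("D1", "D2", "D3"), max_size = 2)
```

With `max_size = 2` this yields the two- and three-term sums once S3 is appended to each subset. Note that it also includes D2 and D3 together, a combination the question's list does not mention.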

Regarding "r - Sum one specific column to n columns in every possible combination of 2 and 3", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57606969/
