I am stuck on a seemingly simple task. Consider the following data.table:
library(data.table)

dt1 <- data.table(ID = as.factor(c("202E", "202E", "202E")),
                  timestamp = as.POSIXct(c("2017-05-02 00:00:00",
                                           "2017-05-02 00:15:00",
                                           "2017-05-02 00:30:00")),
acceleration_raw = c("-0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.727 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.727 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.164 -0.703 0.656 0.141 -0.703 0.656 0.164 -0.703 0.656 0.141 -0.703 0.656 0.141 -0.703 0.656 0.141 -0.703 0.656 0.141",
"-0.703 0.680 0.117 -0.680 0.680 0.117 -0.680 0.680 0.117 -0.680 0.680 0.117 -0.680 0.680 0.117 -0.680 0.680 0.117 -0.680 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.680 0.680 0.117 -0.703 0.680 0.117 -0.680 0.680 0.117 -0.703 0.680 0.117 -0.680 0.680 0.117 -0.703 0.680 0.117 -0.680 0.680 0.117 -0.680 0.680 0.117 -0.680 0.680 0.117 -0.680 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.680 0.680 0.117 -0.680 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117 -0.703 0.680 0.117",
"-0.750 0.586 0.117 -0.773 0.586 0.117 -0.773 0.609 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.750 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.141 -0.773 0.586 0.117 -0.773 0.586 0.141 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.141 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117 -0.773 0.586 0.117"))
Created on 2022-11-17 with reprex v2.0.2
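My goal is to split the acceleration_raw column into 3 separate columns: acc_x, acc_y and acc_z. Each row of acceleration_raw is a character string that ultimately yields 120 numeric observations. I want to split acceleration_raw and then walk through the values in steps of 3: the first value and every third value after it go into acc_x, the second value and every third value after it go into acc_y, and the third value and every third value after it go into acc_z.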
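To make the intended layout concrete, here is a minimal base R sketch for a single string. The string `raw` is a shortened, made-up example with two triples rather than a row of dt1, and the names `values` and `triples` are only illustrative.

```r
# A short, made-up string holding two (x, y, z) triples; not taken from dt1.
raw <- "-0.703 0.656 0.164 -0.727 0.656 0.141"

# Split on single spaces and convert the pieces to numeric.
values <- as.numeric(strsplit(raw, " ", fixed = TRUE)[[1]])

# Fill a 3-column matrix row by row, so that each row holds one triple.
triples <- matrix(values, ncol = 3, byrow = TRUE,
                  dimnames = list(NULL, c("acc_x", "acc_y", "acc_z")))
triples
```

Filling the matrix by row is equivalent to picking values 1, 4, 7, ... for acc_x, values 2, 5, 8, ... for acc_y, and values 3, 6, 9, ... for acc_z.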
I first tried to split acceleration_raw into rows with separate_rows (from tidyr, which is loaded as part of the tidyverse):
library('tidyverse')
library('data.table')
dt1 <- dt1 %>%
  separate_rows(acceleration_raw, sep = " ", convert = FALSE)
And then:
library('tidyverse')
library('data.table')
dt1 <- dt1 %>%
separate_rows(acceleration_raw, sep = " ", convert = F) %>%
mutate(acc_x = seq(acceleration_raw, from = 1, to = length(dt1), by = 3),
acc_y = seq(acceleration_raw, from = 2, to = length(dt1), by = 3),
acc_z = seq(acceleration_raw, from = 3, to = length(dt1), by = 3))
#> Warning in seq.default(acceleration_raw, from = 1, to = length(dt1), by = 3):
#> first element used of 'length.out' argument
#> Error in `mutate()`:
#> ! Problem while computing `acc_x = seq(acceleration_raw, from = 1, to =
#> length(dt1), by = 3)`.
#> Caused by error in `ceiling()`:
#> ! non-numeric argument to mathematical function
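A likely reading of these messages: because from, to and by are all named in the call, the unnamed first argument (the character column) is matched to seq()'s length.out argument, which is why the warning mentions length.out and the numeric step then fails. In addition, length() of a data frame or data.table returns its number of columns, not its number of rows. seq() only generates positions, so it has to be used as an index into the vector. The sketch below shows that pattern on a made-up character vector `x`; it is an illustration of the indexing idea, not a fix for the pipeline above.

```r
# Made-up vector of six values (two triples), standing in for the split column.
x <- c("-0.703", "0.656", "0.164", "-0.727", "0.656", "0.141")

# seq() produces positions; subsetting with them picks every third value.
idx_x <- seq(from = 1, to = length(x), by = 3)  # positions 1 and 4
acc_x <- x[idx_x]
acc_y <- x[seq(from = 2, to = length(x), by = 3)]
acc_z <- x[seq(from = 3, to = length(x), by = 3)]
```

Note that the three results are each a third as long as `x`, so they cannot be assigned as new columns inside mutate() on the long, one-value-per-row table without first reshaping it.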
Any suggestions on how to proceed?
Best answer
You can use pivot_wider and unnest:
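library(tidyverse)

dt1 %>%
  # one row per number in acceleration_raw
  separate_rows(acceleration_raw, sep = " ", convert = FALSE) %>%
  # label the values cyclically as x, y, z
  mutate(id = rep(c("acc_x", "acc_y", "acc_z"), times = nrow(.) / 3)) %>%
  # gather each label into a list column per ID and timestamp
  pivot_wider(names_from = id, values_from = acceleration_raw, values_fn = list) %>%
  # expand the list columns back into one row per triple
  unnest(cols = c("acc_x", "acc_y", "acc_z"))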
This returns
# A tibble: 120 × 5
   ID    timestamp           acc_x  acc_y acc_z
   <fct> <dttm>              <chr>  <chr> <chr>
 1 202E  2017-05-02 00:00:00 -0.703 0.656 0.164
 2 202E  2017-05-02 00:00:00 -0.703 0.656 0.164
 3 202E  2017-05-02 00:00:00 -0.703 0.656 0.164
 4 202E  2017-05-02 00:00:00 -0.703 0.656 0.164
 5 202E  2017-05-02 00:00:00 -0.703 0.656 0.164
 6 202E  2017-05-02 00:00:00 -0.703 0.656 0.164
 7 202E  2017-05-02 00:00:00 -0.703 0.656 0.164
 8 202E  2017-05-02 00:00:00 -0.703 0.656 0.164
 9 202E  2017-05-02 00:00:00 -0.703 0.656 0.164
10 202E  2017-05-02 00:00:00 -0.703 0.656 0.164
# … with 110 more rows
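The three new columns in this output are still character (`<chr>`), while the question asks for numeric observations. One possible follow-up, sketched here on a made-up miniature table `small` (not the data from the question), is to convert them with across(). The use of `length.out = n()` instead of `times = nrow(.) / 3` is a stylistic substitution on my part, not part of the original answer.

```r
library(tidyverse)

# Made-up miniature input: two rows with two triples each.
small <- tibble(
  ID = c("A", "A"),
  timestamp = as.POSIXct(c("2017-05-02 00:00:00", "2017-05-02 00:15:00"),
                         tz = "UTC"),
  acceleration_raw = c("-0.703 0.656 0.164 -0.727 0.656 0.141",
                       "-0.680 0.680 0.117 -0.703 0.680 0.117")
)

result <- small %>%
  separate_rows(acceleration_raw, sep = " ") %>%
  # recycle the three labels over all rows
  mutate(id = rep(c("acc_x", "acc_y", "acc_z"), length.out = n())) %>%
  pivot_wider(names_from = id, values_from = acceleration_raw,
              values_fn = list) %>%
  unnest(cols = c(acc_x, acc_y, acc_z)) %>%
  # convert the character columns to numeric
  mutate(across(starts_with("acc_"), as.numeric))

result
```

The cyclic labelling relies on every string containing a multiple of three values; a string with a different count would shift the labels for all rows after it.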
Regarding R: splitting rows into multiple rows and then a column into multiple columns, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/74480834/