Is there a way to use pivot_longer and pivot_wider on a subset of variables? Here is an example. First, I'll create a data frame with the desired starting structure.
library(tidyverse)
# Assume this as starting df
arrests <- USArrests %>%
  as_tibble(rownames = "State") %>%
  pivot_longer(-State, names_to = "Crime", values_to = "Value") %>%
  group_by(State) %>%
  mutate(Total = sum(Value)) %>%
  ungroup()
arrests
# A tibble: 200 x 4
State Crime Value Total
<chr> <chr> <dbl> <dbl>
1 Alabama Murder 13.2 328.
2 Alabama Assault 236 328.
3 Alabama UrbanPop 58 328.
4 Alabama Rape 21.2 328.
5 Alaska Murder 10 366.
6 Alaska Assault 263 366.
7 Alaska UrbanPop 48 366.
8 Alaska Rape 44.5 366.
9 Arizona Murder 8.1 413.
10 Arizona Assault 294 413.
# ... with 190 more rows
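So we work with the arrests data frame. Now I want to collapse "Total" into "Crime", so that "Total" becomes one of the values of Crime, just like "Murder". I also want to go the other way: once "Total" has been collapsed into "Crime", I want to use pivot_wider on "Crime", but only for the rows where Crime == "Total". Are these operations possible?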
Best answer
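One option is add_row. After splitting the data by 'State' with group_split, loop over the resulting list with map_dfr, add a row to each piece (add_row from tibble) holding the first value of the 'Total' column, and then drop the 'Total' column.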
library(dplyr)
library(purrr)
library(tibble)
arrests2 <- arrests %>%
  group_split(State) %>%
  map_dfr(~ .x %>%
            add_row(State = .$State[1], Crime = 'Total',
                    Value = .$Total[1]) %>%
            select(-Total))
arrests2
# A tibble: 250 x 3
# State Crime Value
# * <chr> <chr> <dbl>
# 1 Alabama Murder 13.2
# 2 Alabama Assault 236
# 3 Alabama UrbanPop 58
# 4 Alabama Rape 21.2
# 5 Alabama Total 328.
# 6 Alaska Murder 10
# 7 Alaska Assault 263
# 8 Alaska UrbanPop 48
# 9 Alaska Rape 44.5
#10 Alaska Total 366.
# … with 240 more rows
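The same per-state add_row idea can also be sketched without splitting into a list. This is a variant, not the answer's code: it assumes a dplyr version that provides group_modify, and the name arrests_gm is made up for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Starting data frame, built as in the question
arrests <- USArrests %>%
  as_tibble(rownames = "State") %>%
  pivot_longer(-State, names_to = "Crime", values_to = "Value") %>%
  group_by(State) %>%
  mutate(Total = sum(Value)) %>%
  ungroup()

# Within each State group, append one row whose Crime is 'Total'.
# Inside group_modify, .x holds the group's rows without the State column;
# the State key is added back to the result automatically.
arrests_gm <- arrests %>%
  group_by(State) %>%
  group_modify(~ add_row(.x, Crime = "Total", Value = .x$Total[1])) %>%
  ungroup() %>%
  select(-Total)
```

Each state should end up with its four original rows followed by one 'Total' row, matching the shape of arrests2 above.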
Another option is to summarise the 'Total' value per state and then bind_rows it to the original rows:
arrests %>%
  group_by(State) %>%
  summarise(Crime = 'Total', Value = first(Total)) %>%
  bind_rows(arrests %>% select(-Total), .) %>%
  arrange(State)
Or use pivot_longer:
library(tidyr)
arrests %>%
  pivot_longer(cols = Value:Total) %>%
  mutate(Crime = replace(Crime, name == 'Total', 'Total')) %>%
  select(-name) %>%
  distinct()
# A tibble: 250 x 3
# State Crime value
# <chr> <chr> <dbl>
# 1 Alabama Murder 13.2
# 2 Alabama Total 328.
# 3 Alabama Assault 236
# 4 Alabama UrbanPop 58
# 5 Alabama Rape 21.2
# 6 Alaska Murder 10
# 7 Alaska Total 366.
# 8 Alaska Assault 263
# 9 Alaska UrbanPop 48
#10 Alaska Rape 44.5
# … with 240 more rows
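To see why the pivot_longer version needs the final distinct(), it may help to walk through the intermediate steps for a single state. The sketch below restricts the data to Alabama; the object names alabama, stacked, relabelled and collapsed are chosen here for illustration only.

```r
library(dplyr)
library(tidyr)

# Starting data frame from the question, limited to one state for readability
alabama <- USArrests %>%
  as_tibble(rownames = "State") %>%
  pivot_longer(-State, names_to = "Crime", values_to = "Value") %>%
  group_by(State) %>%
  mutate(Total = sum(Value)) %>%
  ungroup() %>%
  filter(State == "Alabama")

# Step 1: stack Value and Total, so each of the 4 rows becomes 2 rows
stacked <- alabama %>%
  pivot_longer(cols = Value:Total)

# Step 2: relabel the rows that came from the Total column, then drop
# the helper column; this leaves 4 identical 'Total' rows
relabelled <- stacked %>%
  mutate(Crime = replace(Crime, name == "Total", "Total")) %>%
  select(-name)

# Step 3: distinct() collapses the repeated 'Total' rows into one
collapsed <- distinct(relabelled)
```

Because distinct() removes any fully duplicated row, this approach relies on each State and Crime pair appearing only once in the starting data, which holds for this example.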
To do the reverse, group by 'State', create the 'Total' column by extracting the 'Value' whose 'Crime' is 'Total', and then filter out the rows where Crime is 'Total'.
arrests2 %>%
  group_by(State) %>%
  mutate(Total = Value[Crime == 'Total']) %>%
  filter(Crime != 'Total')
# A tibble: 200 x 4
# Groups: State [50]
# State Crime Value Total
# <chr> <chr> <dbl> <dbl>
# 1 Alabama Murder 13.2 328.
# 2 Alabama Assault 236 328.
# 3 Alabama UrbanPop 58 328.
# 4 Alabama Rape 21.2 328.
# 5 Alaska Murder 10 366.
# 6 Alaska Assault 263 366.
# 7 Alaska UrbanPop 48 366.
# 8 Alaska Rape 44.5 366.
# 9 Arizona Murder 8.1 413.
#10 Arizona Assault 294 413.
# … with 190 more rows
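Since the question asks specifically about pivot_wider, the reverse step can also be sketched by widening only the 'Total' rows and joining them back by State. This is one possible approach rather than the answer's method. The block rebuilds arrests2 from USArrests so that it runs on its own, and the names totals and restored are illustrative.

```r
library(dplyr)
library(tidyr)

# Long table in which 'Total' is one of the Crime values (same shape as arrests2)
arrests2 <- USArrests %>%
  as_tibble(rownames = "State") %>%
  mutate(Total = Murder + Assault + UrbanPop + Rape) %>%
  pivot_longer(-State, names_to = "Crime", values_to = "Value")

# Widen only the rows where Crime == "Total": one row per State, one Total column
totals <- arrests2 %>%
  filter(Crime == "Total") %>%
  pivot_wider(names_from = Crime, values_from = Value)

# Attach the Total column to the remaining rows
restored <- arrests2 %>%
  filter(Crime != "Total") %>%
  left_join(totals, by = "State")
```

The result has the same columns as the starting data frame (State, Crime, Value, Total) and, unlike the group_by version, is not grouped.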
Regarding r - using pivot_longer and pivot_wider on a subset of variables with dplyr, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60895646/