r - 带有 ggplot2 的堆叠蝴蝶/ Tornado 图,以零为中心,用于可视化前后李克特问卷

标签 r ggplot2 facet likert stacked-bar-chart

我有来自教育模块之前和之后的问卷调查的数据。我正在尝试使用 ggplot 在堆叠条形图中可视化数据,更具体地说是堆叠蝴蝶/ Tornado 图,我想让两个不同的场合(= 1 之前,= 2 之后)充当水平面。我已经手绘了我想要的图片,还有我目前得到的以及我的ggplot代码。

我有什么:

enter image description here

我需要什么:

enter image description here

我的代码:

likert_viz <- ggplot(all.gg.data, aes(x = number, y = item, fill = opinion)) +
  geom_col(width = 5.0) +
  geom_vline(xintercept = 50, color = c("white")) +
  scale_y_discrete(limits = all.gg.data$item, labels = scales::wrap_format(50)) +
  scale_x_continuous(labels = scales::percent) +
  labs(title = "Opinion", subtitle = "Subtitle", x = "Percentage", y = "", color = "") +
  facet_wrap(~occasion)

我的数据:(https://pastebin.com/CdgSseKJ)

我尝试过 facet,希望能够按照我想要的方式重新排列它们,但未能成功排列它们,以便一次一个问题的“之前”答案堆叠在“之后”答案之上。希望得到有用的建议或解决方案。

最佳答案

这相当棘手,我找到了两个解决方案:为正面意见创建一个面,为负面意见创建一个面,并删除面之间的空格;明确定义每个柱的开始和结束并将其传递给 geom_segment 而不是 geom_col

公共(public)部分

独立于您喜欢的方法,有很多共同点:

您问的第一件事是将不同场合的酒吧堆叠在一起。这可以通过在 geom_... 中添加 position = "dodge""dodge2" 来完成;或者通过执行 paste(item, "- Occasion", occasion) 创建一个新的 y 轴。第一个选项更优雅,但我无法让它工作。

第一步:创建新的 y 轴,并将 number 转换为频率(以便 x 轴从 -100% 变为 100%):

all.gg.data2 = df %>%
  mutate(item2 = paste(item, "- Occasion", occasion)) %>%
  group_by(item, occasion) %>%
  mutate(number = number / sum(number))

第二步:因为我们希望 N 以零为中心,所以将该意见分成两半,即“积极部分”和“消极部分”

all.gg.data.N = all.gg.data2 %>%
  filter(opinion == "N") %>%
  {rbind(mutate(., number = number/2, opinion = "N-"),
         mutate(., number =  number/2, opinion = "N+"))}

第三步:添加新的 N 行,创建 scale 变量,并根据我们想要的方法转换 number 列。 第四步:绘图。现在我按方法来解释。

Obs:两者都使用自定义函数(基于 scales::wrap_format)从 y 轴移除额外的文本:

my_wrap_format = function(x) {
  x[seq(2, length(x), 2)] = gsub(".+(- Occasion [1-2])", "\\1", x[seq(2, length(x), 2)])
  
  unlist(lapply(strwrap(x, width = 50, simplify = FALSE), 
                paste0, collapse = "\n"))}

分面geom_col 方法

第三步:为负数部分更改number的符号。

all.gg.data = all.gg.data2 %>%
  filter(opinion != "N") %>%
  rbind(all.gg.data.N) %>%
  mutate(opinion = factor(opinion, levels = c("SD", "D", "N-", "SA", "A", "N+")),
         scale = ifelse(opinion %in% c("SD", "D", "N-"), "-", "+"),
         number = ifelse(scale == "-", -number, number))

第四步:根据scale创建facet(facet_wrap(vars(scale))),去除x轴多余空间( expand = c(0, 0)panel.spacing = unit(0, "cm")), 并用 连接 "N+"和 "N-">scale_fill_manual

ggplot(all.gg.data, aes(x = number, y = item2, fill = opinion)) +
  geom_col() +
  facet_wrap(vars(scale), ncol = 2, scales = "free_x") +
  scale_y_discrete(labels = my_wrap_format) +
  scale_x_continuous(labels = scales::percent, expand = c(0, 0)) +
  labs(title = "Opinion", subtitle = "Subtitle", x = "Percentage", y = "", color = "") +
  scale_fill_manual(values = c(SD = "darkblue", D = "blue", `N-` = "grey",
                                `N+` = "grey", A = "red", SA = "darkred"),
                     labels = c("SD", "D", "N", "A", "SA"),
                     breaks = c("SD", "D", "N-", "A", "SA")) +
  theme(panel.spacing = unit(0, "cm"),
        strip.text = element_blank(),
        strip.background = element_blank())

结果: enter image description here

geom_segment 方法

第三步:为geom_segment 定义xxend 美学。 xend 是通过将 number 值与 purrr::accumulate 相加而创建的,但是在一个方向上是正尺度,另一个方向是负尺度. x 是由 xend 的滞后和每个尺度的不同起点创建的。

my_accumulate = function(number, scale) {
  accumulate(number*ifelse(scale == "-", -1, 1), sum, .dir = ifelse(scale[1] == "-", "backward", "forward"))}

all.gg.data = all.gg.data2 %>%
  filter(opinion != "N") %>%
  rbind(all.gg.data.N) %>%
  mutate(opinion = factor(opinion, levels = c("SD", "D", "N-", "N+", "A", "SA")),
         scale = ifelse(opinion %in% c("SD", "D", "N-"), "-", "+")) %>%
  arrange(item, occasion, opinion) %>%
  group_by(item, occasion, scale) %>%
  mutate(number = my_accumulate(number, scale),
         numberStart = if(scale[1] == "-") {c(stats::lag(number, -1)[-1], 0)} else {c(0, stats::lag(number)[-3])})

Obs:您可以认为明确定义每个柱的起点和终点是由 geom_col 自动完成的,这就是为什么这个方法更大。

第四步:scale_color_manual连接“N+”和“N-”。

ggplot(all.gg.data, aes(x = numberStart, xend = number,
                        y = item2, yend = item2, color = opinion)) +
  geom_segment(size = 2) +
  scale_y_discrete(labels = my_wrap_format) +
  scale_x_continuous(labels = scales::percent) +
  labs(title = "Opinion", subtitle = "Subtitle", x = "Percentage", y = "", color = "") +
  scale_color_manual(values = c(SD = "darkblue", D = "blue", `N-` = "grey",
                       `N+` = "grey", A = "red", SA = "darkred"),
                     labels = c("SD", "D", "N", "A", "SA"),
                     breaks = c("SD", "D", "N-", "A", "SA"))

结果: enter image description here

附录:完整代码

geom_segment方法:

my_wrap_format = function(x) {
  x[seq(2, length(x), 2)] = gsub(".+(- Occasion [1-2])", "\\1", x[seq(2, length(x), 2)])
  
  unlist(lapply(strwrap(x, width = 50, simplify = FALSE), 
                paste0, collapse = "\n"))}

my_accumulate = function(number, scale) {
  accumulate(number*ifelse(scale == "-", -1, 1), sum, .dir = ifelse(scale[1] == "-", "backward", "forward"))}

all.gg.data2 = df %>%
  mutate(item2 = paste(item, "\n- Occasion", occasion)) %>%
  group_by(item, occasion) %>%
  mutate(number = number / sum(number))

all.gg.data.N = all.gg.data2 %>%
  filter(opinion == "N") %>%
  {rbind(mutate(., number = number/2, opinion = "N-"),
         mutate(., number =  number/2, opinion = "N+"))}

all.gg.data = all.gg.data2 %>%
  filter(opinion != "N") %>%
  rbind(all.gg.data.N) %>%
  mutate(opinion = factor(opinion, levels = c("SD", "D", "N-", "N+", "A", "SA")),
         scale = ifelse(opinion %in% c("SD", "D", "N-"), "-", "+")) %>%
  arrange(item, occasion, opinion) %>%
  group_by(item, occasion, scale) %>%
  mutate(number = my_accumulate(number, scale),
         numberStart = if(scale[1] == "-") {c(stats::lag(number, -1)[-1], 0)} else {c(0, stats::lag(number)[-3])})

ggplot(all.gg.data, aes(x = numberStart, xend = number,
                        y = item2, yend = item2, color = opinion)) +
  geom_segment(size = 10) +
  scale_y_discrete(labels = my_wrap_format) +
  scale_x_continuous(labels = scales::percent) +
  labs(title = "Opinion", subtitle = "Subtitle", x = "Percentage", y = "", color = "") +
  scale_color_manual(values = c(SD = "darkblue", D = "blue", `N-` = "grey",
                       `N+` = "grey", A = "red", SA = "darkred"),
                     labels = c("SD", "D", "N", "A", "SA"),
                     breaks = c("SD", "D", "N-", "A", "SA"))

分面 geom_col 方法:

all.gg.data2 = df %>%
  mutate(item2 = paste(item, "\n- Occasion", occasion)) %>%
  group_by(item, occasion) %>%
  mutate(number = number / sum(number))

all.gg.data.N = all.gg.data2 %>%
  filter(opinion == "N") %>%
  {rbind(mutate(., number = number/2, opinion = "N-"),
         mutate(., number =  number/2, opinion = "N+"))}

all.gg.data = all.gg.data2 %>%
  filter(opinion != "N") %>%
  rbind(all.gg.data.N) %>%
  mutate(opinion = factor(opinion, levels = c("SD", "D", "N-", "SA", "A", "N+")),
         scale = ifelse(opinion %in% c("SD", "D", "N-"), "-", "+"),
         number = ifelse(scale == "-", -number, number))

ggplot(all.gg.data, aes(x = number, y = item2, fill = opinion)) +
  geom_col() +
  facet_wrap(vars(scale), ncol = 2, scales = "free_x") +
  scale_y_discrete(labels = my_wrap_format) +
  scale_x_continuous(labels = scales::percent, expand = c(0, 0)) +
  labs(title = "Opinion", subtitle = "Subtitle", x = "Percentage", y = "", color = "") +
  scale_fill_manual(values = c(SD = "darkblue", D = "blue", `N-` = "grey",
                                `N+` = "grey", A = "red", SA = "darkred"),
                     labels = c("SD", "D", "N", "A", "SA"),
                     breaks = c("SD", "D", "N-", "A", "SA")) +
  theme(panel.spacing = unit(0, "cm"), panel.border = element_blank(),
        strip.text = element_blank(), strip.background = element_blank())

关于r - 带有 ggplot2 的堆叠蝴蝶/ Tornado 图,以零为中心,用于可视化前后李克特问卷,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/74232906/

相关文章:

r - 使用 react 函数作为 ggvis 输入

arrays - 如何在Elasticsearch中按值数组查询/过滤精确计数?

r - 在ggplot2中按模态排序facet_grid或facet_wrap网格

R Blogdown Hugo 学术主题未渲染站点

R基于不同列的运行计数

r - ggplot : gradient scale to diverge on specific break

r - 单独保存图(ggplot2)

eclipse - 无法在 Eclipse 中激活 JPA Facet

python - 使用 RPy 进行多处理安全吗?

r - 在 ggplot2 中使用填充值绘制经度纬度时出错