我有来自教育模块之前和之后的问卷调查的数据。我正在尝试使用 ggplot 在堆叠条形图中可视化数据,更具体地说是堆叠蝴蝶/ Tornado 图,我想让两个不同的场合(= 1 之前,= 2 之后)充当水平面。我已经手绘了我想要的图片,还有我目前得到的以及我的ggplot代码。
我有什么:
我需要什么:
我的代码:
likert_viz <- ggplot(all.gg.data, aes(x = number, y = item, fill = opinion)) +
geom_col(width = 5.0) +
geom_vline(xintercept = 50, color = c("white")) +
scale_y_discrete(limits = all.gg.data$item, labels = scales::wrap_format(50)) +
scale_x_continuous(labels = scales::percent) +
labs(title = "Opinion", subtitle = "Subtitle", x = "Percentage", y = "", color = "") +
facet_wrap(~occasion)
我的数据:(https://pastebin.com/CdgSseKJ)
我尝试过 facet,希望能够按照我想要的方式重新排列它们,但未能成功排列它们,以便一次一个问题的“之前”答案堆叠在“之后”答案之上。希望得到有用的建议或解决方案。
最佳答案
这相当棘手,我找到了两个解决方案:为正面意见创建一个面,为负面意见创建一个面,并删除面之间的空格;明确定义每个柱的开始和结束并将其传递给 geom_segment
而不是 geom_col
。
公共(public)部分
独立于您喜欢的方法,有很多共同点:
您问的第一件事是将不同场合的酒吧堆叠在一起。这可以通过在 geom_...
中添加 position = "dodge"
或 "dodge2"
来完成;或者通过执行 paste(item, "- Occasion", occasion)
创建一个新的 y 轴。第一个选项更优雅,但我无法让它工作。
第一步:创建新的 y 轴,并将 number
转换为频率(以便 x 轴从 -100% 变为 100%):
all.gg.data2 = df %>%
mutate(item2 = paste(item, "- Occasion", occasion)) %>%
group_by(item, occasion) %>%
mutate(number = number / sum(number))
第二步:因为我们希望 N 以零为中心,所以将该意见分成两半,即“积极部分”和“消极部分”
all.gg.data.N = all.gg.data2 %>%
filter(opinion == "N") %>%
{rbind(mutate(., number = number/2, opinion = "N-"),
mutate(., number = number/2, opinion = "N+"))}
第三步:添加新的 N 行,创建 scale
变量,并根据我们想要的方法转换 number
列。 第四步:绘图。现在我按方法来解释。
Obs:两者都使用自定义函数(基于 scales::wrap_format
)从 y 轴移除额外的文本:
my_wrap_format = function(x) {
x[seq(2, length(x), 2)] = gsub(".+(- Occasion [1-2])", "\\1", x[seq(2, length(x), 2)])
unlist(lapply(strwrap(x, width = 50, simplify = FALSE),
paste0, collapse = "\n"))}
分面geom_col
方法
第三步:为负数部分更改number
的符号。
all.gg.data = all.gg.data2 %>%
filter(opinion != "N") %>%
rbind(all.gg.data.N) %>%
mutate(opinion = factor(opinion, levels = c("SD", "D", "N-", "SA", "A", "N+")),
scale = ifelse(opinion %in% c("SD", "D", "N-"), "-", "+"),
number = ifelse(scale == "-", -number, number))
第四步:根据scale
创建facet(facet_wrap(vars(scale))
),去除x轴多余空间( expand = c(0, 0)
和 panel.spacing = unit(0, "cm")
), 并用 连接 "N+"和 "N-">scale_fill_manual
。
ggplot(all.gg.data, aes(x = number, y = item2, fill = opinion)) +
geom_col() +
facet_wrap(vars(scale), ncol = 2, scales = "free_x") +
scale_y_discrete(labels = my_wrap_format) +
scale_x_continuous(labels = scales::percent, expand = c(0, 0)) +
labs(title = "Opinion", subtitle = "Subtitle", x = "Percentage", y = "", color = "") +
scale_fill_manual(values = c(SD = "darkblue", D = "blue", `N-` = "grey",
`N+` = "grey", A = "red", SA = "darkred"),
labels = c("SD", "D", "N", "A", "SA"),
breaks = c("SD", "D", "N-", "A", "SA")) +
theme(panel.spacing = unit(0, "cm"),
strip.text = element_blank(),
strip.background = element_blank())
geom_segment
方法
第三步:为geom_segment
定义x
和xend
美学。 xend
是通过将 number
值与 purrr::accumulate
相加而创建的,但是在一个方向上是正尺度,另一个方向是负尺度. x
是由 xend
的滞后和每个尺度的不同起点创建的。
my_accumulate = function(number, scale) {
accumulate(number*ifelse(scale == "-", -1, 1), sum, .dir = ifelse(scale[1] == "-", "backward", "forward"))}
all.gg.data = all.gg.data2 %>%
filter(opinion != "N") %>%
rbind(all.gg.data.N) %>%
mutate(opinion = factor(opinion, levels = c("SD", "D", "N-", "N+", "A", "SA")),
scale = ifelse(opinion %in% c("SD", "D", "N-"), "-", "+")) %>%
arrange(item, occasion, opinion) %>%
group_by(item, occasion, scale) %>%
mutate(number = my_accumulate(number, scale),
numberStart = if(scale[1] == "-") {c(stats::lag(number, -1)[-1], 0)} else {c(0, stats::lag(number)[-3])})
Obs:您可以认为明确定义每个柱的起点和终点是由 geom_col
自动完成的,这就是为什么这个方法更大。
第四步:用scale_color_manual
连接“N+”和“N-”。
ggplot(all.gg.data, aes(x = numberStart, xend = number,
y = item2, yend = item2, color = opinion)) +
geom_segment(size = 2) +
scale_y_discrete(labels = my_wrap_format) +
scale_x_continuous(labels = scales::percent) +
labs(title = "Opinion", subtitle = "Subtitle", x = "Percentage", y = "", color = "") +
scale_color_manual(values = c(SD = "darkblue", D = "blue", `N-` = "grey",
`N+` = "grey", A = "red", SA = "darkred"),
labels = c("SD", "D", "N", "A", "SA"),
breaks = c("SD", "D", "N-", "A", "SA"))
附录:完整代码
geom_segment
方法:
my_wrap_format = function(x) {
x[seq(2, length(x), 2)] = gsub(".+(- Occasion [1-2])", "\\1", x[seq(2, length(x), 2)])
unlist(lapply(strwrap(x, width = 50, simplify = FALSE),
paste0, collapse = "\n"))}
my_accumulate = function(number, scale) {
accumulate(number*ifelse(scale == "-", -1, 1), sum, .dir = ifelse(scale[1] == "-", "backward", "forward"))}
all.gg.data2 = df %>%
mutate(item2 = paste(item, "\n- Occasion", occasion)) %>%
group_by(item, occasion) %>%
mutate(number = number / sum(number))
all.gg.data.N = all.gg.data2 %>%
filter(opinion == "N") %>%
{rbind(mutate(., number = number/2, opinion = "N-"),
mutate(., number = number/2, opinion = "N+"))}
all.gg.data = all.gg.data2 %>%
filter(opinion != "N") %>%
rbind(all.gg.data.N) %>%
mutate(opinion = factor(opinion, levels = c("SD", "D", "N-", "N+", "A", "SA")),
scale = ifelse(opinion %in% c("SD", "D", "N-"), "-", "+")) %>%
arrange(item, occasion, opinion) %>%
group_by(item, occasion, scale) %>%
mutate(number = my_accumulate(number, scale),
numberStart = if(scale[1] == "-") {c(stats::lag(number, -1)[-1], 0)} else {c(0, stats::lag(number)[-3])})
ggplot(all.gg.data, aes(x = numberStart, xend = number,
y = item2, yend = item2, color = opinion)) +
geom_segment(size = 10) +
scale_y_discrete(labels = my_wrap_format) +
scale_x_continuous(labels = scales::percent) +
labs(title = "Opinion", subtitle = "Subtitle", x = "Percentage", y = "", color = "") +
scale_color_manual(values = c(SD = "darkblue", D = "blue", `N-` = "grey",
`N+` = "grey", A = "red", SA = "darkred"),
labels = c("SD", "D", "N", "A", "SA"),
breaks = c("SD", "D", "N-", "A", "SA"))
分面 geom_col
方法:
all.gg.data2 = df %>%
mutate(item2 = paste(item, "\n- Occasion", occasion)) %>%
group_by(item, occasion) %>%
mutate(number = number / sum(number))
all.gg.data.N = all.gg.data2 %>%
filter(opinion == "N") %>%
{rbind(mutate(., number = number/2, opinion = "N-"),
mutate(., number = number/2, opinion = "N+"))}
all.gg.data = all.gg.data2 %>%
filter(opinion != "N") %>%
rbind(all.gg.data.N) %>%
mutate(opinion = factor(opinion, levels = c("SD", "D", "N-", "SA", "A", "N+")),
scale = ifelse(opinion %in% c("SD", "D", "N-"), "-", "+"),
number = ifelse(scale == "-", -number, number))
ggplot(all.gg.data, aes(x = number, y = item2, fill = opinion)) +
geom_col() +
facet_wrap(vars(scale), ncol = 2, scales = "free_x") +
scale_y_discrete(labels = my_wrap_format) +
scale_x_continuous(labels = scales::percent, expand = c(0, 0)) +
labs(title = "Opinion", subtitle = "Subtitle", x = "Percentage", y = "", color = "") +
scale_fill_manual(values = c(SD = "darkblue", D = "blue", `N-` = "grey",
`N+` = "grey", A = "red", SA = "darkred"),
labels = c("SD", "D", "N", "A", "SA"),
breaks = c("SD", "D", "N-", "A", "SA")) +
theme(panel.spacing = unit(0, "cm"), panel.border = element_blank(),
strip.text = element_blank(), strip.background = element_blank())
关于r - 带有 ggplot2 的堆叠蝴蝶/ Tornado 图,以零为中心,用于可视化前后李克特问卷,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/74232906/