r - Stacked bar chart with varying widths in ggplot

Tags: r ggplot2

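I am trying to build a stacked bar chart with varying widths, so that the width of each bar represents the mean amount of the allocations and the height represents the number of allocations.

Below is my reproducible data: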

procedure = c("method1","method2", "method3", "method4","method1","method2", "method3", "method4","method1","method2", "method3","method4")
sector =c("construction","construction","construction","construction","delivery","delivery","delivery","delivery","service","service","service","service") 
number = c(100,20,10,80,75,80,50,20,20,25,10,4)
amount_mean = c(1,1.2,0.2,0.5,1.3,0.8,1.5,1,0.8,0.6,0.2,0.9) 

data0 = data.frame(procedure, sector, number, amount_mean)

When I use geom_bar and include width in aes, I get the following error message:

"position_stack requires non-overlapping x intervals". Furthermore, the bars are no longer stacked.

bar<-ggplot(data=data0,aes(x=sector,y=number,fill=procedure, width = amount_mean)) + 
geom_bar(stat="identity") 

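I also looked at the mekko package, but it seems to be meant for bar charts only.

This is what I would like to end up with (not based on the data above):

[Image: desired outcome (not based on the data above)]

Any idea how I can solve my problem?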

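Best answer

I tried this as well, and geom_col() behaves the same way: with position = "stack" it seems we cannot assign a width parameter without the bars coming unstacked.

It turns out, though, that the solution is quite simple: we can use geom_rect() to build such a plot "by hand".

With your data: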

df = data.frame(
  procedure   = rep(paste("method", 1:4), times = 3),
  sector      = rep(c("construction", "delivery", "service"), each = 4),
  amount      = c(100, 20, 10, 80, 75, 80, 50, 20, 20, 25, 10, 4),
  amount_mean = c(1, 1.2, 0.2, 0.5, 1.3, 0.8, 1.5, 1, 0.8, 0.6, 0.2, 0.9)
)

First, I transformed your dataset:
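library(dplyr)

df <- df %>%
  mutate(amount_mean = amount_mean/max(amount_mean),
         # factor() first: sector is a character column in R >= 4.0,
         # and as.numeric() on a character vector would return NA
         sector_num = as.numeric(factor(sector))) %>%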
  arrange(desc(amount_mean)) %>%
  group_by(sector) %>%
  mutate(
    xmin = sector_num - amount_mean / 2,
    xmax = sector_num + amount_mean /2,
    ymin = cumsum(lag(amount, default = 0)), 
    ymax = cumsum(amount)) %>%
  ungroup()

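What I am doing here:
  • I scaled amount_mean down so that 0 <= amount_mean <= 1 (this is better for plotting, and we have no other scale to show the real value of amount_mean anyway);
  • I also encoded the sector variable as a number (for plotting, see below);
  • I sorted the dataset by amount_mean in descending order (heavy means at the bottom, light means at the top);
  • Grouping by sector, I computed xmin and xmax to represent amount_mean, and ymin and ymax for the amount. The latter two are a bit trickier. ymax is obvious: you just take the cumulative sum of all amount values, starting from the first one. You need a cumulative sum to compute ymin as well, but starting from 0. So the first rectangle is drawn with ymin = 0, the second with ymin equal to the ymax of the previous rectangle, and so on. All of this is done within each separate sector group.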
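The arithmetic in the last bullet can be checked by hand on a single sector. The following is a minimal base-R sketch, not the answer's own code: it uses the four construction rows already sorted by descending amount_mean, the names w and rects are chosen only for illustration, and c(0, head(ymax, -1)) is a base-R stand-in for cumsum(lag(amount, default = 0)).

```r
# Construction rows sorted by descending amount_mean:
# method2, method1, method4, method3.
amount      <- c(20, 100, 80, 10)
amount_mean <- c(1.2, 1, 0.5, 0.2)

# Rescale widths by the global maximum of amount_mean (1.5 in the full data).
w <- amount_mean / 1.5

sector_num <- 1                      # "construction" is the first sector

xmin <- sector_num - w / 2           # rectangles are centred on sector_num
xmax <- sector_num + w / 2
ymax <- cumsum(amount)               # top edge: running total of amount
ymin <- c(0, head(ymax, -1))         # bottom edge: previous top, starting at 0

rects <- data.frame(amount, xmin, xmax, ymin, ymax)
print(rects)
```

Each rectangle starts where the previous one ends, so the four segments stack from 0 up to the sector total of 210.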

Plotting the data:
    library(ggplot2)

    df %>%
      ggplot(aes(xmin = xmin, xmax = xmax,
                 ymin = ymin, ymax = ymax, 
                 fill = procedure
                 )
             ) +
      geom_rect() +
      scale_x_continuous(breaks = df$sector_num, labels = df$sector) +
      #ggthemes::theme_tufte() +
      theme_bw() +
      labs(title = "Question 51136471", x = "Sector", y = "Amount") +
      theme(
        axis.ticks.x = element_blank()
        )
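Passing df$sector_num and df$sector to scale_x_continuous() repeats each break once per row. One possible tidy-up, sketched here as an assumption rather than as part of the original answer, is to derive a single break per sector first; axis_map is a hypothetical helper name.

```r
sector     <- rep(c("construction", "delivery", "service"), each = 4)
sector_num <- as.numeric(factor(sector))   # 1, 2, 3 in alphabetical order

# One row per distinct (sector_num, sector) pair.
axis_map <- unique(data.frame(sector_num, sector))

# These vectors could then be used as:
# scale_x_continuous(breaks = axis_map$sector_num, labels = axis_map$sector)
print(axis_map)
```

The plot itself should look the same; the difference is only that the axis is described by three breaks instead of twelve duplicated ones.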
    

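Result:

[Image: pyramid_plot]

Another option is to prevent the procedure variable from being reordered, so that, say, "red" is always at the bottom, "green" above it, and so on. But it looks ugly: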
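    # Assumes dplyr is loaded and df is the data frame from the previous step.
    df <- df %>%
      mutate(amount_mean = amount_mean/max(amount_mean),
             # factor() first: sector is a character column in R >= 4.0
             sector_num = as.numeric(factor(sector))) %>%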
      arrange(procedure, desc(amount), desc(amount_mean)) %>%
      group_by(sector) %>%
      mutate(
        xmin = sector_num - amount_mean / 2,
        xmax = sector_num + amount_mean /2,
        ymin = cumsum(lag(amount, default = 0)), 
        ymax = cumsum(amount)
        ) %>%
      ungroup()
    

[Image: pyramid_plot_ugly]

Source: the corresponding question on Stack Overflow, "Stacked bar chart with varying widths in ggplot": https://stackoverflow.com/questions/51136471/
