r - Plot points from one df and error bars from another df

The raw data looks like this:

Restaurant     Question               rating

McDonalds      How was the food?      5       
McDonalds      How were the drinks?   3     
McDonalds      How were the workers?  2     
Burger_King    How was the food?      1       
Burger_King    How were the drinks?   3       
Burger_King    How were the workers?  4

The averages are as follows:

Question              average_rating    error
How was the food?     3.13              0.7
How were the drinks?  2.37              0.56

How can I plot the points from the raw data (x = question, y = rating, fill = restaurant) and then draw error bars on top of them (ymin/ymax = average rating ± error)?

Tribbles for convenience:

library(tibble)

# Raw ratings (referred to as df1 in the answer below)
df1 <- tribble(
  ~restaurant, ~question,  ~rating,
  "McDonalds", "How was the food?", 5,
  "McDonalds", "How were the drinks?", 3,
  "McDonalds", "How were the drinks?", 2,
  "BurgerKing", "How was the food?", 1,
  "BurgerKing", "How were the drinks?", 3,
  "BurgerKing", "How were the drinks?", 4
)

# Average rating and error per question (referred to as df3 in the answer's edit)
df3 <- tribble(
  ~question, ~average_rating, ~error,
  "How was the food?", 3.13, 0.7,
  "How were the drinks?", 2.37, 0.56
)

Best answer

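Your desired output does not quite match your current data frames, because your second data frame (df2) contains the average rating per restaurant, not per question (as @StupidWolf pointed out). So either you want to plot the restaurants on the x-axis, which is easy to do, or you need to merge the two data frames and make Average_rating one of the discrete values of the variable question.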
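For the first option, here is a minimal sketch with the restaurants on the x-axis. It assumes that df2 holds one row per restaurant with the columns restaurant, average_rating and error. That layout is inferred from the description above and is not shown in the question as it currently stands. The restaurant names are also written as "Burger_King" in both tables so that they line up on the axis.

```r
library(tibble)
library(ggplot2)

# Raw ratings from the question, with the restaurant name harmonised to "Burger_King".
df1 <- tribble(
  ~restaurant,   ~question,              ~rating,
  "McDonalds",   "How was the food?",    5,
  "McDonalds",   "How were the drinks?", 3,
  "McDonalds",   "How were the drinks?", 2,
  "Burger_King", "How was the food?",    1,
  "Burger_King", "How were the drinks?", 3,
  "Burger_King", "How were the drinks?", 4
)

# ASSUMPTION: per-restaurant averages; this layout is inferred, not given in the question.
df2 <- tribble(
  ~restaurant,   ~average_rating, ~error,
  "McDonalds",   3.13,            0.7,
  "Burger_King", 2.37,            0.56
)

p_restaurant <- ggplot(df1, aes(x = restaurant, y = rating)) +
  # Raw ratings, jittered horizontally only so the y values stay exact.
  geom_jitter(aes(color = question), width = 0.1, height = 0) +
  # Mean and error come from the second data frame, so the aesthetics
  # of the main plot are not inherited.
  geom_pointrange(
    data = df2,
    aes(x = restaurant, y = average_rating,
        ymin = average_rating - error, ymax = average_rating + error),
    inherit.aes = FALSE
  )

p_restaurant
```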

For the second option, I did the following:

library(dplyr)
df2 %>% mutate(question = "Average_rating") %>%
  rename(rating = average_rating) %>% full_join(df1,.) %>%
  mutate(restaurant = sub("BurgerKing","Burger_King",restaurant)) 
Joining, by = c("restaurant", "question", "rating")
# A tibble: 8 x 4
  restaurant  question             rating error
  <chr>       <chr>                 <dbl> <dbl>
1 McDonalds   How was the food?      5    NA   
2 McDonalds   How were the drinks?   3    NA   
3 McDonalds   How were the drinks?   2    NA   
4 Burger_King How was the food?      1    NA   
5 Burger_King How were the drinks?   3    NA   
6 Burger_King How were the drinks?   4    NA   
7 McDonalds   Average_rating         3.13  0.7 
8 Burger_King Average_rating         2.37  0.56

Then, if you want to draw the plot, you can do the following:

library(ggplot2)
library(dplyr)
df2 %>% mutate(question = "Average_rating") %>%
  rename(rating = average_rating) %>% full_join(df1,.) %>%
  mutate(restaurant = sub("BurgerKing","Burger_King",restaurant)) %>%
  ggplot(aes(x = question, y= rating, color = restaurant))+
  geom_point(position = position_dodge(0.9))+
  geom_errorbar(aes(ymin = rating-error, ymax = rating+error), width = 0.1, position = position_dodge(0.9))
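To run this second option end to end, here is a self-contained sketch. It again assumes a per-restaurant df2 (an inferred layout, not the per-question table in the question). It also swaps full_join for bind_rows. Since no row of one table matches a row of the other, the join only stacks them, and bind_rows does the same without the "Joining, by" message. na.rm = TRUE is added so that the raw-rating rows, which have no error value, are dropped from the error-bar layer silently.

```r
library(tibble)
library(dplyr)
library(ggplot2)

# Raw ratings, spelled as in the question ("BurgerKing").
df1 <- tibble(
  restaurant = rep(c("McDonalds", "BurgerKing"), each = 3),
  question   = rep(c("How was the food?", "How were the drinks?",
                     "How were the drinks?"), times = 2),
  rating     = c(5, 3, 2, 1, 3, 4)
)

# ASSUMPTION: per-restaurant averages; this layout is inferred, not given in the question.
df2 <- tibble(
  restaurant     = c("McDonalds", "Burger_King"),
  average_rating = c(3.13, 2.37),
  error          = c(0.7, 0.56)
)

combined <- df2 %>%
  mutate(question = "Average_rating") %>%   # averages become their own x category
  rename(rating = average_rating) %>%
  bind_rows(df1, .) %>%                      # stack raw rows first, then the averages
  mutate(restaurant = sub("BurgerKing", "Burger_King", restaurant))

p_combined <- ggplot(combined, aes(x = question, y = rating, color = restaurant)) +
  geom_point(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = rating - error, ymax = rating + error),
                width = 0.1, position = position_dodge(0.9), na.rm = TRUE)

p_combined
```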

Edit: plotting the mean and error for each question

With the new data frame containing the average rating per question, you can use geom_pointrange as follows:

ggplot(df1, aes(x = question, y = rating, color = restaurant))+
  geom_jitter(width = 0.2)+
  geom_pointrange(inherit.aes = FALSE,
                  data = df3, 
                  aes(x = question, 
                      y = average_rating,
                      ymin = average_rating-error,
                      ymax = average_rating+error))
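If you want plain error bars rather than a point range, as the question literally asks, the same idea works with geom_errorbar. The sketch below is self-contained and assumes that df3 is the per-question table from the question. The name df3 and the choice of shape 21 (a point shape that accepts a fill aesthetic) are my assumptions for illustration.

```r
library(tibble)
library(ggplot2)

# Raw ratings, as given in the question.
df1 <- tribble(
  ~restaurant,  ~question,              ~rating,
  "McDonalds",  "How was the food?",    5,
  "McDonalds",  "How were the drinks?", 3,
  "McDonalds",  "How were the drinks?", 2,
  "BurgerKing", "How was the food?",    1,
  "BurgerKing", "How were the drinks?", 3,
  "BurgerKing", "How were the drinks?", 4
)

# Per-question averages, as given in the question (named df3 here by assumption).
df3 <- tribble(
  ~question,              ~average_rating, ~error,
  "How was the food?",    3.13,            0.7,
  "How were the drinks?", 2.37,            0.56
)

p_errorbar <- ggplot(df1, aes(x = question, y = rating)) +
  # Shape 21 is a filled circle, so fill = restaurant takes effect.
  geom_point(aes(fill = restaurant), shape = 21, size = 3,
             position = position_jitter(width = 0.2, height = 0)) +
  # The error bars use their own data and aesthetics.
  geom_errorbar(
    data = df3,
    aes(x = question,
        ymin = average_rating - error,
        ymax = average_rating + error),
    width = 0.1,
    inherit.aes = FALSE
  )

p_errorbar
```

Setting inherit.aes = FALSE matters here because df3 has no rating column, so the y mapping from the main plot cannot be evaluated on it.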

Does it answer your question?

Regarding "r - Plot points from one df and error bars from another df", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/60233796/
