I am trying to draw boxplots of wages by region.
Here is a sample of my dataset (provided by an institute):
> head(final2, 20)
   nquest nord ireg staciv etalav acontrib       nome_reg tpens   pesofit
1     173    1   18      3     25       35       Calabria  1800 0.3801668
2    2886    1   13      1     26       35        Abruzzo  1211 0.2383701
3    2886    2   13      1     20       42        Abruzzo  2100 0.2383701
4    5416    1    8      3     16       30 Emilia Romagna   700 0.8819879
5    7886    1    9      1     22       35        Toscana  2000 1.2452078
6   20297    1    5      1     14       39         Veneto  1200 1.6694498
7   20711    2    4      1     15       37       Trentino  2000 3.3746801
8   22169    1   15      4     40        5       Campania   600 1.6875562
9   22276    1    8      2     18       37 Emilia Romagna  1200 2.1782894
10  22286    1    8      1     15       19 Emilia Romagna   850 3.0333999
11  22286    2    8      1     15       35 Emilia Romagna   650 3.0333999
12  22657    1   16      1     25       40         Puglie  1400 0.3616937
13  22657    2   16      1     26       36         Puglie  1500 0.3616937
14  23490    1    5      2     23       36         Veneto  1400 0.9763965
15  24147    1    4      1     26       35       Trentino  1730 1.2479984
16  24147    2    4      1     18       45       Trentino  1600 1.2479984
17  24853    1   11      1     18       38         Marche  2180 0.3475683
18  27238    1   12      1     16       31          Lazio  1050 3.6358952
19  27730    1   20      1     15       37       Sardegna  1470 0.7232677
20  27734    1   20      1     16       45       Sardegna  1159 0.6959107
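Variables:
nquest = the family code
nord = the member of the family
nome_reg = the region where they live
tpens = the wage each of them earns
pesofit = the weight of each observation
This is the code I am using: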
library(dplyr)    # filter() and %>%
library(ggplot2)

final2 %>%
  filter(nome_reg == "Piemonte" |
           nome_reg == "Valle D'Aosta" |
           nome_reg == "Lombardia" |
           nome_reg == "Liguria") %>%
  ggplot(aes(x = factor(nome_reg,
                        levels = c("Piemonte", "Valle D'Aosta", "Lombardia", "Liguria")),
             y = tpens, fill = nome_reg)) +
  geom_boxplot(varwidth = TRUE)
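This gives me this plot.
Is there a way to draw a weighted boxplot? I mean a boxplot that takes into account the weight of each observation (in this case, the weight pesofit attached to each person's wage tpens in each region).
I am already running a weighted regression, so I would like to visualize the weighted data as well.
I tried adding weight = pesofit inside aes: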
final2 %>%
  filter(nome_reg == "Piemonte" |
           nome_reg == "Valle D'Aosta" |
           nome_reg == "Lombardia" |
           nome_reg == "Liguria") %>%
  ggplot(aes(x = factor(nome_reg, levels = c("Piemonte", "Valle D'Aosta", "Lombardia", "Liguria")),
             y = tpens, fill = nome_reg, weight = pesofit)) +
  geom_boxplot(varwidth = TRUE)
But R responds with:
Warning message:
The following aesthetics were dropped during statistical transformation: weight
i This can happen when ggplot fails to infer the correct grouping structure in the data.
i Did you forget to specify a `group` aesthetic or to convert a numerical variable into a factor?
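How can I solve this?
Best answer
Despite the warning, specifying the weight appears to have the desired effect. The following simple example shows how the weights change the plot: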
library(ggplot2)

set.seed(0)
tmp <- data.frame(x = rnorm(100))      # some random data to plot
tmp$y <- ifelse(tmp$x > 0, 1, 0.1)     # weight positive values highly
ggplot(tmp, aes(x = x)) + geom_boxplot()               # unweighted
ggplot(tmp, aes(x = x, weight = y)) + geom_boxplot()   # weighted
#Warning message:
#The following aesthetics were dropped during statistical transformation: weight
#ℹ This can happen when ggplot fails to infer the correct grouping structure in the data.
#ℹ Did you forget to specify a `group` aesthetic or to convert a numerical variable into a factor?
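One way to check that the weights really reach the computed statistics, rather than judging by eye, is to pull the box statistics out of both plots with layer_data() and compare them. This is only a sketch: the grouping column g and the names p_plain, p_weighted, stats_plain and stats_weighted are made up for illustration, and it assumes the quantreg package is installed, since ggplot2 relies on it for weighted quantiles as far as I can tell.

```r
library(ggplot2)

set.seed(0)
# Same toy data as above, plus a dummy group so the boxplot is vertical
tmp <- data.frame(g = "a", x = rnorm(100))
tmp$w <- ifelse(tmp$x > 0, 1, 0.1)   # weight positive values highly

p_plain    <- ggplot(tmp, aes(x = g, y = x)) + geom_boxplot()
p_weighted <- ggplot(tmp, aes(x = g, y = x, weight = w)) + geom_boxplot()

# layer_data() returns the statistics ggplot2 computed for the layer;
# lower, middle and upper are the quartiles and the median of the box
cols <- c("lower", "middle", "upper")
stats_plain    <- layer_data(p_plain)[, cols]
stats_weighted <- suppressWarnings(layer_data(p_weighted))[, cols]

print(rbind(plain = stats_plain, weighted = stats_weighted))
```

If the weights are honoured, the weighted box should sit higher than the unweighted one, because the positive values carry ten times the weight of the negative ones.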
The warning appears to be spurious and may be related to this bug.
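If you would rather not depend on how the weight aesthetic is handled, an alternative is to compute the weighted quartiles yourself and hand them to geom_boxplot(stat = "identity"). The sketch below is an assumption-laden illustration, not the method ggplot2 uses internally: weighted_quantile is a hypothetical helper based on the inverse of the weighted empirical CDF (one of several possible definitions), the toy data are invented, and the whiskers are simply drawn to the minimum and maximum without any outlier rule.

```r
library(ggplot2)

# Hypothetical helper: smallest x whose cumulative weight share reaches p
weighted_quantile <- function(x, w, probs) {
  ord <- order(x)
  x <- x[ord]
  cum_w <- cumsum(w[ord]) / sum(w)
  vapply(probs, function(p) x[which(cum_w >= p)[1]], numeric(1))
}

# Invented toy data that reuses the column names from the question
toy <- data.frame(
  nome_reg = rep(c("Piemonte", "Liguria"), each = 5),
  tpens    = c(700, 900, 1200, 1500, 2100, 600, 850, 1000, 1400, 1800),
  pesofit  = c(1, 2, 4, 2, 1, 4, 3, 1, 1, 1)
)

# One row of box statistics per region
box_stats <- do.call(rbind, lapply(split(toy, toy$nome_reg), function(d) {
  q <- weighted_quantile(d$tpens, d$pesofit, c(0.25, 0.5, 0.75))
  data.frame(nome_reg = d$nome_reg[1],
             ymin = min(d$tpens), lower = q[1], middle = q[2],
             upper = q[3], ymax = max(d$tpens))
}))

# stat = "identity" draws the boxes from the precomputed statistics
p <- ggplot(box_stats,
            aes(x = nome_reg, ymin = ymin, lower = lower, middle = middle,
                upper = upper, ymax = ymax, fill = nome_reg)) +
  geom_boxplot(stat = "identity")
print(p)
```

In this toy data the weighted median for Liguria is 850, whereas the unweighted median would be 1000, which shows how heavily weighted low wages pull the box down.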