r - Plotting densities for a semantic differential dataset

Tags: r dataframe ggplot2 plot

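I am here because, after hours of research and failed attempts, I do not know what to try next.

I have a dataset (opened with the open_excel command via dplyr) that looks like this (but more complex, with more variables):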

> dput(open)
structure(list(Subject = c(1, 2, 3, 4, 5), `Happy - Before` = c(4, 
4, 2, 1, 7), `Courageous - Before` = c(5, 2, 1, 3, 4), `Strange - Before` = c(1, 
2, 1, 4, 6), `Happy - After` = c(4, 2, 6, 2, 2), `Courageous - After` = c(7, 
1, 5, 1, 2), `Strange - After` = c(3, 7, 4, 5, 4)), row.names = c(NA, 
-5L), class = c("tbl_df", "tbl", "data.frame"))


# A tibble: 5 x 7
  Subject `Happy - Before` `Courageous - B… `Strange - Befo… `Happy - After`
    <dbl>            <dbl>            <dbl>            <dbl>           <dbl>
1       1                4                5                1               4
2       2                4                2                2               2
3       3                2                1                1               6
4       4                1                3                4               2
5       5                7                4                6               2
# … with 2 more variables: `Courageous - After` <dbl>, `Strange - After` <dbl>

My goal is to draw a density plot with a few specific requirements:

Density of scores obtained by all the subjects for each ability trait on a scale from 1 to 7

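As you can see in my (awful) chart, I am trying to show, with density plots, the responses of all subjects on a scale from 1 to 7 (x-axis), for every trait I have (y-axis), with the responses given before the [test] kept separate from those given after it. I also need the same kind of labels (Courageous shown on the left, Not courageous on the right). The closer a participant's answer is to 7, the more [happy, courageous, depressed, anxious, ...] they are; the closer to 1, the more [unhappy, not courageous, not depressed, ...] they are.

I did my best (using ggplot2 templates, trying to melt everything), but I am fairly new to R and to programming :/

All my variables are named like this: [trait1]_before, [trait2]_before, [trait1]_After, [trait2]_After

I hope this post is clear. If not, I will gladly add information! Thank you all (and sorry for my English).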

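Best answer

A density plot assumes a continuous variable along the x-axis, whereas your scores only take values from 1 to 7. This means you can either plot the densities with tails that extend past the ends of the scale, or force a cutoff at the ends of the data.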
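To make the two options concrete, here is a minimal base R sketch using only the `Happy - Before` scores from the example data. It relies on `stats::density()`, which evaluates the estimate on a grid that extends past the data by default; the `from` and `to` arguments restrict that grid. The object names `d_full` and `d_cut` are only illustrative.

```r
# Scores from the `Happy - Before` column of the example data
scores <- c(4, 4, 2, 1, 7)

# Default kernel density estimate: the evaluation grid extends beyond
# the smallest and largest observed scores
d_full <- density(scores)
range(d_full$x)

# Same estimate, evaluated only between the ends of the 1 to 7 scale
d_cut <- density(scores, from = 1, to = 7)
range(d_cut$x)
```

Note that cutting the grid this way only hides the tails. The curve is not rescaled, so the area under the truncated curve is somewhat less than 1.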

data <- structure(list(Subject = c(1, 2, 3, 4, 5),
                   `Happy - Before` = c(4, 4, 2, 1, 7),
                   `Courageous - Before` = c(5, 2, 1, 3, 4),
                   `Strange - Before` = c(1, 2, 1, 4, 6),
                   `Happy - After` = c(4, 2, 6, 2, 2),
                   `Courageous - After` = c(7, 1, 5, 1, 2),
                   `Strange - After` = c(3, 7, 4, 5, 4)),
              row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))

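library(tidyverse)  # loads dplyr, tidyr and ggplot2
library(ggridges)   # geom_density_ridges2(), theme_ridges()
library(grid)       # grid.draw(), unit()
library(gtable)     # gtable_add_cols(), gtable_add_grob()

# Reshape to long format (one row per subject and measure), then split each
# column name such as "Happy - Before" into the trait and the before/after status
dataPivot <- data %>% 
  pivot_longer(-Subject, names_to = "measure", values_to = "score") %>% 
  mutate(status = sub(".* - ", "", measure),           # text after " - "
         feature = sub(" - .*", "", measure),          # text before " - "
         featureOpposite = paste('Not', feature)) %>%  # label for the right-hand axis
  mutate_if(is.character, as.factor)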
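The question mentions that the real columns are named like `[trait1]_before` and `[trait1]_After` rather than `Happy - Before`. As a rough sketch, the same `sub()` idea could be adapted as follows. The column names below are hypothetical, made up to follow that pattern, and lower-casing the status is an assumption made so that `before` and `After` end up spelled consistently.

```r
# Hypothetical column names following the pattern described in the question
cols <- c("Happy_before", "Courageous_before", "Happy_After", "Courageous_After")

# Everything after the last underscore is the status; lower-case it so the
# two spellings ("before", "After") are treated consistently
status <- tolower(sub(".*_", "", cols))

# Dropping the last underscore and what follows leaves the trait name
feature <- sub("_[^_]*$", "", cols)

data.frame(cols, feature, status)
```

Inside the `mutate()` call, the same two expressions would be applied to the `measure` column instead of `cols`.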

If you do not want to truncate the plot:

# Create the first plot with the axis on the left
p1 <- ggplot(dataPivot, aes(x = score, y = feature)) +
  geom_density_ridges2(aes(fill = status), scale = 0.7, alpha = .3) +
  scale_x_continuous(breaks = c(1, 7)) +
  labs(y = NULL) +
  theme_ridges() + 
  theme(legend.position = "bottom")

# Create a second plot with the axis labels on the right; only its axis is reused
p2 <- ggplot(dataPivot, aes(x = score, y = featureOpposite)) +
  geom_density_ridges2(aes(fill = status), scale = 0.7, alpha = .3) +
  scale_y_discrete(position = "right") +
  theme_ridges() + 
  theme(legend.position = "bottom")

# Convert both plots to gtables
g1 <- ggplot_gtable(ggplot_build(p1))
g2 <- ggplot_gtable(ggplot_build(p2))

# Add an empty column at the right edge of the first plot (pos = -1) to make
# room for the right axis
g1 <- gtable_add_cols(g1, widths = unit(0.2, "null"), pos = -1)

# Look up where the right-axis cell of the first plot sits in the layout
pp <- c(subset(g1$layout, name == "axis-r", select = t:r))

# Add the y-axis from the second plot into that cell
g <- gtable_add_grob(g1, g2$grobs[[which(g2$layout$name == "axis-r")]], pp$t, pp$r, 
  pp$b, pp$r)

grid.draw(g)

If you want to truncate the ends:

# Here we are just adding a height aesthetic, changing stat to "density" and
# adding trim = TRUE. after_stat(density) replaces the deprecated ..density..
# notation (ggplot2 >= 3.3.0).
p1 <- ggplot(dataPivot, aes(x = score, y = feature, height = after_stat(density))) +
  geom_density_ridges2(aes(fill = status), scale = 0.7, alpha = .3, stat = "density", 
  trim = TRUE) +
  scale_x_continuous(breaks = c(1, 7)) +
  labs(y = NULL) +
  theme_ridges() + 
  theme(legend.position = "bottom")

# Second plot with the axis labels on the right; only its axis is reused
p2 <- ggplot(dataPivot, aes(x = score, y = featureOpposite, height = after_stat(density))) +
  geom_density_ridges2(aes(fill = status), scale = 0.7, alpha = .3, stat = "density",
  trim = TRUE) +
  scale_y_discrete(position = "right") +
  theme_ridges() + 
  theme(legend.position = "bottom")

# Convert both plots to gtables
g1 <- ggplot_gtable(ggplot_build(p1))
g2 <- ggplot_gtable(ggplot_build(p2))

# Add an empty column at the right edge of the first plot for the right axis
g1 <- gtable_add_cols(g1, widths = unit(0.2, "null"), pos = -1)

# Look up where the right-axis cell of the first plot sits in the layout
pp <- c(subset(g1$layout, name == "axis-r", select = t:r))

# Add the y-axis from the second plot into that cell
g <- gtable_add_grob(g1, g2$grobs[[which(g2$layout$name == "axis-r")]], pp$t, pp$r, 
  pp$b, pp$r)

grid.draw(g)

Regarding r - Plotting densities for a semantic differential dataset, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59428935/
