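python - Show text descriptions instead of numbers on the x-axis with pandas and matplotlib

Tags: python pandas matplotlib data-visualization

I have written code to display my dataset as a bar chart. Here is my code. I read the data from a .csv file like this: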

import pandas as pd

names = ["Clinic Number","Question Text","Answer Text","Answer Date","Class"]
data = pd.read_csv('ADLCI.csv', names=names)

Then:

grouped = data.groupby(['Question Text','Answer Text']).size().reset_index(name='counts')

import matplotlib.pyplot as plt
plt.figure()

grouped.plot(kind='bar', title ="Functional Status Count", figsize=(15, 10), legend=True, fontsize=12)
plt.show()

This is the resulting data frame that I want to display as a bar chart:

                         Question Text Answer Text  counts
0                          CI function          No     513
1                          CI function         Yes     373
2                             bathing?          No    2827
3                             bathing?         Yes     408
4                            dressing?          No    2824
5                            dressing?         Yes     423
6                              feeding          No    2851
7                              feeding         Yes     160
8                         housekeeping          No    2803
9                         housekeeping         Yes     717
10                      preparing food          No    2604
11                      preparing food         Yes     593
12  responsibility for own medications          No    2793
13  responsibility for own medications         Yes     625
14                            shopping          No      35
15                            shopping         Yes      49
16                           toileting          No    2843
17                           toileting         Yes     239
18                        transferring          No    2834
19                        transferring         Yes     904
20                using transportation          No    2816
21                using transportation         Yes     483

The first column of numbers was added automatically; my dataset does not actually contain that column.

This is the bar chart produced by this code: [image]

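As you can see in the bar chart, all the bars have the same color, and the x-axis shows the numbers I mentioned. That is not what I want. I want something that looks like this link:

Let me explain the changes I want to make to the picture uploaded here.

The x-axis should show the Question Text column instead of 0, 1, and so on. Specifically: as the data frame shows, there are two CI function rows, one for Yes and one for No. Instead of 0 and 1 I want a single CI function label with two bars in different colors, one for the No count (1596) and one for the Yes count (1376).

The next item would be bathing?, again with one bar for 17965 and another for 702.

That way I should end up with about ten groups, each made of two bars placed side by side, as in the link above.

I tried various approaches, including the one in the link above, but mine either did not look like that or raised an error.

Thanks :)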

Update 1: When I apply your code:

import matplotlib.pyplot as plt
data.groupby(['Question Text','Answer Text']).sum().unstack().plot(kind='bar')
plt.show()

I get this error:

  Traceback (most recent call last):
  File "C:/Users/M193053/PycharmProjects/ADL-distribution/test.py", line 52, in <module>
    data.groupby(['Question Text','Answer Text']).sum().unstack().plot(kind='bar')
  File "C:\Users\M193053\Documents\Anaconda3\envs\conda3\lib\site-packages\pandas\plotting\_core.py", line 2941, in __call__
    sort_columns=sort_columns, **kwds)
  File "C:\Users\M193053\Documents\Anaconda3\envs\conda3\lib\site-packages\pandas\plotting\_core.py", line 1977, in plot_frame
    **kwds)
  File "C:\Users\M193053\Documents\Anaconda3\envs\conda3\lib\site-packages\pandas\plotting\_core.py", line 1804, in _plot
    plot_obj.generate()
  File "C:\Users\M193053\Documents\Anaconda3\envs\conda3\lib\site-packages\pandas\plotting\_core.py", line 258, in generate
    self._compute_plot_data()
  File "C:\Users\M193053\Documents\Anaconda3\envs\conda3\lib\site-packages\pandas\plotting\_core.py", line 373, in _compute_plot_data
    'plot'.format(numeric_data.__class__.__name__))
TypeError: Empty 'DataFrame': no numeric data to plot

But when I use this code:

grouped = data.groupby(['Question Text','Answer Text']).size().reset_index(name='counts')

import matplotlib.pyplot as plt
grouped.groupby(['Question Text','Answer Text']).sum().unstack().plot(kind='bar')
plt.show()

I think this works: [image]

But applying groupby twice seems illogical, so I am still not sure what I should do. Thank you for your time :)
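A note on the double groupby: the second groupby(...).sum() mostly just moves the two text columns back into the index, so it is probably not needed. The TypeError above most likely appears because the raw data offers no numeric column for sum() to add up, whereas size() always returns integer counts. Below is a minimal sketch, assuming a tiny made-up sample in place of ADLCI.csv (the sample rows are not the real data): size() counts the rows per question and answer, and unstack() turns the answers into columns, which is the shape DataFrame.plot draws as grouped bars.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up sample standing in for the rows read from ADLCI.csv (assumption).
data = pd.DataFrame({
    "Question Text": ["bathing?", "bathing?", "bathing?", "feeding", "feeding", "feeding"],
    "Answer Text": ["No", "Yes", "No", "No", "No", "Yes"],
})

# One groupby: count rows per (question, answer), then move the answers into columns.
counts = data.groupby(["Question Text", "Answer Text"]).size().unstack(fill_value=0)

# Each row of `counts` becomes one labelled x-axis group;
# each column (No / Yes) becomes one bar with its own color.
ax = counts.plot(kind="bar", title="Functional Status Count", figsize=(15, 10), fontsize=12)
ax.set_ylabel("counts")
plt.tight_layout()
# plt.show()  # uncomment to display the figure in an interactive session
```

Because the question texts form the index of counts, they appear on the x-axis in place of 0, 1, 2, and so on.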

Update 2

This is my data frame, obtained with the following code:

grouped = data.groupby(['Question Text','Answer Text']).size().reset_index(name='counts')

0                          CI function          No     513
1                          CI function         Yes     373
2                             bathing?          No    2827
3                             bathing?         Yes     408
4                            dressing?          No    2824
5                            dressing?         Yes     423
6                              feeding          No    2851
7                              feeding         Yes     160
8                         housekeeping          No    2803
9                         housekeeping         Yes     717
10                      preparing food          No    2604
11                      preparing food         Yes     593
12  responsibility for own medications          No    2793
13  responsibility for own medications         Yes     625
14                            shopping          No      35
15                            shopping         Yes      49
16                           toileting          No    2843
17                           toileting         Yes     239
18                        transferring          No    2834
19                        transferring         Yes     904
20                using transportation          No    2816
21                using transportation         Yes     483

And this is the data frame obtained by combining your code with mine:

grouped = data.groupby(['Question Text','Answer Text']).size().reset_index(name='counts')
print(grouped)
import matplotlib.pyplot as plt
final = grouped.groupby(['Question Text','Answer Text']).sum()
print(final)


Question Text                      Answer Text        
CI function                        No              513
                                   Yes             373
bathing?                           No             2827
                                   Yes             408
dressing?                          No             2824
                                   Yes             423
feeding                            No             2851
                                   Yes             160
housekeeping                       No             2803
                                   Yes             717
preparing food                     No             2604
                                   Yes             593
responsibility for own medications No             2793
                                   Yes             625
shopping                           No               35
                                   Yes              49
toileting                          No             2843
                                   Yes             239
transferring                       No             2834
                                   Yes             904
using transportation               No             2816
                                   Yes             483

Update 3

The original data frame has 200000 rows and looks like this:

1                             bathing?          No       3529933
2                            dressing?          No       3529933
3                              feeding          No       3529933
4                         housekeeping          No       3529933
5   responsibility for own medications          No       3529933
6                 using transportation          No       3529933
7                            toileting          No       3529933
8                         transferring          No       3529933
10                      preparing food          No       3529933
11                            bathing?         NaN       2864155
12                           dressing?         NaN       2864155
13                             feeding         NaN       2864155
14                        housekeeping         NaN       2864155
15  responsibility for own medications         NaN       2864155
16                           toileting         NaN       2864155
17                        transferring         NaN       2864155
19                      preparing food         NaN       2864155
20                using transportation         Yes       2864155
21                            bathing?         NaN       2921299
22                           dressing?         NaN       2921299

Best answer

You can do it like this (df is the data frame you wrote):

import matplotlib
import matplotlib.pyplot as plt
matplotlib.style.use('ggplot')
df.groupby(['Question Text','Answer Text']).sum().unstack().plot(kind='bar')
plt.show()

Output: [image]

You can also rotate the x labels like this:

plt.xticks(rotation=45)

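But I suggest shortening the labels to make the chart clearer.

Regarding python - showing text descriptions instead of numbers on the x-axis with pandas and matplotlib, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50765643/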
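One way to shorten the labels could look like the sketch below. The two rows reuse counts from the question's table, but the short names in short_names are hypothetical and only there for illustration; wrapping with textwrap.fill is shown as an alternative that keeps the full wording.

```python
import textwrap

import pandas as pd
import matplotlib.pyplot as plt

# Two rows with counts copied from the question's table, already in wide form.
counts = pd.DataFrame(
    {"No": [2793, 2816], "Yes": [625, 483]},
    index=["responsibility for own medications", "using transportation"],
)

# Option 1: rename long labels to shorter ones (these short names are made up).
short_names = {
    "responsibility for own medications": "medications",
    "using transportation": "transport",
}
counts_short = counts.rename(index=short_names)

# Option 2: keep the wording but wrap each label onto several lines.
wrapped = [textwrap.fill(label, width=16) for label in counts.index]

ax = counts_short.plot(kind="bar")
plt.xticks(rotation=45, ha="right")  # rotate and right-align the tick labels
plt.tight_layout()
# plt.show()  # uncomment to display the figure in an interactive session
```

To use the wrapped labels instead, assign them to the index (for example counts.index = wrapped) before plotting.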
