pandas - How to combine multiple groups of columns into a bar chart in pandas

Tags: pandas matplotlib

I have 3 tables:

Morning  Tennis Score  Baseball Score
Jack     10            2
Emma     5             6
Theo     5             8

Noon     Tennis Score  Baseball Score
Jack     5             4
Emma     8             2
Theo     12            9

Evening  Tennis Score  Baseball Score
Jack     3             3
Emma     6             7
Theo     8             3

I need to combine the columns from the different tables into a single chart. My attempt so far:


dataf = pd.read_csv('score.txt', names=["Tennis","Baseball"])

dataf.plot(kind='bar')

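The plot I get is:

[Image: separate bars]

Here the values sit at different x-axis positions. How can I group them together? I also need to combine the values from the different tables and plot them in the same figure. The final plot should look like this: [Image: final plot]

How do I group the columns together and plot them in the same chart?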

Best answer

Given these DataFrames:

m_df = pd.DataFrame({
    'Morning': ['Jack', 'Emma', 'Theo'],
    'Tennis Score': [10, 5, 5],
    'Baseball Score': [2, 6, 8]
})

n_df = pd.DataFrame({
    'Noon': ['Jack', 'Emma', 'Theo'],
    'Tennis Score': [5, 8, 12],
    'Baseball Score': [4, 2, 9]
})

e_df = pd.DataFrame({
    'Evening': ['Jack', 'Emma', 'Theo'],
    'Tennis Score': [3, 6, 8],
    'Baseball Score': [3, 7, 3]
})

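They should be combined with concat, and each DataFrame should receive an indicator column recording which table it came from. In addition, the Morning, Noon, and Evening columns should be renamed to a common name so that they align in a single column: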

keys = ['Morning', 'Noon', 'Evening']
plot_df = pd.concat(
    [df_.assign(id=label)
         .rename(columns={label: 'Player'})
         .set_index('Player')
     for df_, label in zip([m_df, n_df, e_df], keys)]
).reset_index()

plot_df:

  Player  Tennis Score  Baseball Score       id
0   Jack            10               2  Morning
1   Emma             5               6  Morning
2   Theo             5               8  Morning
3   Jack             5               4     Noon
4   Emma             8               2     Noon
5   Theo            12               9     Noon
6   Jack             3               3  Evening
7   Emma             6               7  Evening
8   Theo             8               3  Evening
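The same long table can also be built with the `keys` argument of `pd.concat` instead of `assign`. This is a sketch of an alternative, not the approach used above; the names `indexed` and `alt_df` are made up for illustration.

```python
import pandas as pd

m_df = pd.DataFrame({'Morning': ['Jack', 'Emma', 'Theo'],
                     'Tennis Score': [10, 5, 5],
                     'Baseball Score': [2, 6, 8]})
n_df = pd.DataFrame({'Noon': ['Jack', 'Emma', 'Theo'],
                     'Tennis Score': [5, 8, 12],
                     'Baseball Score': [4, 2, 9]})
e_df = pd.DataFrame({'Evening': ['Jack', 'Emma', 'Theo'],
                     'Tennis Score': [3, 6, 8],
                     'Baseball Score': [3, 7, 3]})

keys = ['Morning', 'Noon', 'Evening']

# Rename each table's first column to 'Player' and use it as the index
indexed = [df_.rename(columns={label: 'Player'}).set_index('Player')
           for df_, label in zip([m_df, n_df, e_df], keys)]

# keys= adds an outer index level recording which table each row came from;
# rename_axis names both index levels explicitly before they become columns
alt_df = (
    pd.concat(indexed, keys=keys)
      .rename_axis(['id', 'Player'])
      .reset_index()
)
print(alt_df)
```

The rows come out in the same order as in plot_df; only the column order differs, with id and Player first.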

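Then pivot from long to wide, reindex so the index follows Morning, Noon, Evening order (rather than alphabetical order), and use swaplevel and sort_index so that the columns are grouped by player rather than by score type: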

plot_df = (
    plot_df.pivot(index='id', columns='Player')
        .reindex(keys)
        .swaplevel(0, 1, axis=1)
        .sort_index(level=0, axis=1)
        .rename_axis(columns=["Player", 'Score Type'])
)

plot_df:

Player               Emma                        Jack                        Theo             
Score Type Baseball Score Tennis Score Baseball Score Tennis Score Baseball Score Tennis Score
id                                                                                            
Morning                 6            5              2           10              8            5
Noon                    2            8              4            5              9           12
Evening                 7            6              3            3              3            8
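To see what each step of that chain contributes, the intermediate results can be inspected one at a time. This is a minimal sketch: `long_df` is typed in directly rather than built with concat, and the names `wide`, `ordered`, and `swapped` are illustrative only.

```python
import pandas as pd

keys = ['Morning', 'Noon', 'Evening']
long_df = pd.DataFrame({
    'Player': ['Jack', 'Emma', 'Theo'] * 3,
    'Tennis Score': [10, 5, 5, 5, 8, 12, 3, 6, 8],
    'Baseball Score': [2, 6, 8, 4, 2, 9, 3, 7, 3],
    'id': ['Morning'] * 3 + ['Noon'] * 3 + ['Evening'] * 3,
})

# pivot returns the index labels in sorted (alphabetical) order
wide = long_df.pivot(index='id', columns='Player')
print(list(wide.index))

# reindex restores the intended time-of-day order
ordered = wide.reindex(keys)
print(list(ordered.index))

# swaplevel puts Player on the outer column level;
# sort_index then places each player's columns next to each other
swapped = ordered.swaplevel(0, 1, axis=1).sort_index(level=0, axis=1)
print(list(swapped.columns))
```

Without the reindex step the bars would be drawn in Evening, Morning, Noon order.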

This can be plotted directly:

Imports:

import pandas as pd
from matplotlib import pyplot as plt

Plotting code:

fig, ax = plt.subplots()
plot_df.plot(kind='bar', rot=0, ax=ax, xlabel='', ylabel='Score')
plt.tight_layout()
plt.show()

plot 1 (simple plot)
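It helps to know how pandas lays the bars out on the axes: each column becomes one series of bars, with one bar per row. The sketch below checks this on a small stand-in frame. `demo_df` is a made-up two-player subset of the data, and the Agg backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; an assumption for headless runs
import pandas as pd
from matplotlib import pyplot as plt

# Stand-in for plot_df: rows are times of day, columns are (player, score type)
columns = pd.MultiIndex.from_product(
    [['Emma', 'Jack'], ['Baseball Score', 'Tennis Score']],
    names=['Player', 'Score Type'])
demo_df = pd.DataFrame(
    [[6, 5, 2, 10], [2, 8, 4, 5], [7, 6, 3, 3]],
    index=pd.Index(['Morning', 'Noon', 'Evening'], name='id'),
    columns=columns)

fig, ax = plt.subplots()
demo_df.plot(kind='bar', rot=0, ax=ax, legend=False)

# One bar container per column, each holding one bar per row
n_containers = len(ax.containers)

# Bar heights in patch order, compared with the values read column by column
heights = [patch.get_height() for patch in ax.patches]
column_major = demo_df.to_numpy().ravel(order='F').tolist()
print(n_containers, heights == column_major)
plt.close(fig)
```

So ax.patches runs through every row of the first column, then every row of the second column, and so on.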


Or update the facecolor and hatch of the patches:

Imports:

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

Plotting code:

fig, ax = plt.subplots()
plot_df.plot(kind='bar', rot=0, ax=ax, legend=False, xlabel='', ylabel='Score')

players = plot_df.columns.get_level_values(0).unique()
score_types = plot_df.columns.get_level_values(1).unique()
# Create hatches (should be same length as types of scores)
hatches = np.tile(np.repeat(['/', '.'], plot_df.shape[0]), len(players))
# Create Colors (should be same number of colours as number of players)
colours = np.repeat(['green', 'pink', 'purple'],
                    len(score_types) * plot_df.shape[0])
# Iterate over patches, colours, and hatches to set the facecolor and hatch
for patch, colour, hatch in zip(ax.patches, colours, hatches):
    patch.set_facecolor(colour)
    patch.set_hatch(hatch)

# Add legend:
ax.legend(loc=1)
plt.tight_layout()
plt.show()

plot 2 hatched and colors applied
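The np.repeat and np.tile calls are easier to follow with the patch order written out explicitly. The sketch below needs only NumPy and assumes the bars are drawn column by column, which is what the zip over ax.patches above relies on. The names `patch_labels`, `colour_of`, and `hatch_of` are illustrative only.

```python
import numpy as np

players = ['Emma', 'Jack', 'Theo']
score_types = ['Baseball Score', 'Tennis Score']
n_rows = 3  # Morning, Noon, Evening

# Assumed patch order: all rows of (Emma, Baseball), then (Emma, Tennis), ...
patch_labels = [(player, score_type)
                for player in players
                for score_type in score_types
                for _ in range(n_rows)]

# Each colour covers one player's block of len(score_types) * n_rows bars
colours = np.repeat(['green', 'pink', 'purple'], len(score_types) * n_rows)
# Each hatch covers n_rows bars, and the pattern repeats once per player
hatches = np.tile(np.repeat(['/', '.'], n_rows), len(players))

# Collect which colours and hatches each player and score type ends up with
colour_of = {}
hatch_of = {}
for (player, score_type), colour, hatch in zip(patch_labels, colours, hatches):
    colour_of.setdefault(player, set()).add(str(colour))
    hatch_of.setdefault(score_type, set()).add(str(hatch))

print(colour_of)
print(hatch_of)
```

Every player maps to exactly one colour and every score type to exactly one hatch, so the two arrays line up with the patches as long as their lengths match the number of bars.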


Or go a step further and use custom legends:

Imports:

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.lines import Line2D
from matplotlib.patches import Patch

Plotting code:

fig, ax = plt.subplots()
plot_df.plot(kind='bar', rot=0, ax=ax, legend=False, xlabel='', ylabel='Score')

players = plot_df.columns.get_level_values(0).unique()
score_types = plot_df.columns.get_level_values(1).unique()
colours = ['green', 'pink', 'purple']
hatches = ['/', '.']
iter_colours = np.repeat(colours, len(score_types) * plot_df.shape[0])
iter_hatches = np.tile(np.repeat(hatches, plot_df.shape[0]), len(players))
for patch, colour, hatch in zip(ax.patches, iter_colours, iter_hatches):
    patch.set_facecolor(colour)
    patch.set_hatch(hatch)

# Add legends:
player_legend = ax.legend(
    [Line2D([0], [0], color=colour, lw=4) for colour in colours],
    players, title='Players', loc=1)

score_legend = ax.legend(
    [Patch(hatch=hatch, facecolor='white') for hatch in hatches],
    score_types, loc=2, title='Score Type', labelspacing=.65)

for patch in score_legend.get_patches():
    patch.set_height(14)
    patch.set_y(-3)

ax.add_artist(player_legend)
plt.tight_layout()
plt.show()

plot 3 custom legends


A complete working example with 5 players, from the imports to plt.show():

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

m_df = pd.DataFrame({
    'Morning': ['Jack', 'Emma', 'Theo', 'Matt', 'Thomas'],
    'Tennis Score': [10, 5, 5, 7, 9],
    'Baseball Score': [2, 6, 8, 2, 4]
})

n_df = pd.DataFrame({
    'Noon': ['Jack', 'Emma', 'Theo', 'Matt', 'Thomas'],
    'Tennis Score': [5, 8, 12, 3, 10],
    'Baseball Score': [4, 2, 9, 5, 6]
})

e_df = pd.DataFrame({
    'Evening': ['Jack', 'Emma', 'Theo', 'Matt', 'Thomas'],
    'Tennis Score': [3, 6, 8, 4, 7],
    'Baseball Score': [3, 7, 3, 9, 5]
})
# concat
keys = ['Morning', 'Noon', 'Evening']
plot_df = pd.concat(
    [df_.assign(id=label)
         .rename(columns={label: 'Player'})
         .set_index('Player')
     for df_, label in zip([m_df, n_df, e_df], keys)]
).reset_index()
# pivot to wide
plot_df = (
    plot_df.pivot(index='id', columns='Player')
        .reindex(keys)
        .swaplevel(0, 1, axis=1)
        .sort_index(level=0, axis=1)
        .rename_axis(columns=["Player", 'Score Type'])
)

fig, ax = plt.subplots()
plot_df.plot(kind='bar', rot=0, ax=ax, legend=False, xlabel='', ylabel='Score')

players = plot_df.columns.get_level_values(0).unique()
score_types = plot_df.columns.get_level_values(1).unique()
hatches = np.tile(np.repeat(['/', '.'], plot_df.shape[0]), len(players))
colours = np.repeat(['red', 'green', 'blue', 'orange', 'pink'],
                    len(score_types) * plot_df.shape[0])

for patch, colour, hatch in zip(ax.patches, colours, hatches):
    patch.set_facecolor(colour)
    patch.set_hatch(hatch)

ax.legend(loc=1)
plt.tight_layout()
plt.show()

plot 4 complete

Regarding "pandas - How to combine multiple groups of columns into a bar chart in pandas", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68304175/