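I have a multi-index DataFrame that looks like the data below. When I plot it, the chart looks like the one shown below.
How can I draw a bar chart in which the color of each bar is determined by a category of my choice (for example, "City")? All bars belonging to the same city should have the same color, regardless of the year. For example, in the chart below, all ATL bars should be red and all MIA bars should be blue.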
City            ATL                                    MIA
Year           2010         2011         2012         2010         2011         2012
Taste
Bitter  3159.861983  3149.806667  2042.348937  3124.586470  3119.541240  1070.925695
Sour    1078.897032  3204.689424  3065.818991  2084.322056  2108.568495  3178.131540
Spicy   5280.847114  3134.597728  1015.311288  2036.494136  1001.532560  3164.382635
Sweet   1056.169267  1015.368646  4217.145165  3134.734027  4144.826118  3173.919338
Here is my code:
import sys
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import random

matplotlib.style.use('ggplot')


def main():
    taste = ['Sweet', 'Spicy', 'Sour', 'Bitter']
    store = ['Asian', 'Italian', 'American', 'Greek', 'Mexican']

    # Six random sales tables, one per (year, city) combination
    df1 = pd.DataFrame({'Taste': [random.choice(taste) for x in range(10)],
                        'Store': [random.choice(store) for x in range(10)],
                        'Sold': 1000 + 100 * np.random.rand(10)})
    df2 = pd.DataFrame({'Taste': [random.choice(taste) for x in range(10)],
                        'Store': [random.choice(store) for x in range(10)],
                        'Sold': 1000 + 100 * np.random.rand(10)})
    df3 = pd.DataFrame({'Taste': [random.choice(taste) for x in range(10)],
                        'Store': [random.choice(store) for x in range(10)],
                        'Sold': 1000 + 100 * np.random.rand(10)})
    df4 = pd.DataFrame({'Taste': [random.choice(taste) for x in range(10)],
                        'Store': [random.choice(store) for x in range(10)],
                        'Sold': 1000 + 100 * np.random.rand(10)})
    df5 = pd.DataFrame({'Taste': [random.choice(taste) for x in range(10)],
                        'Store': [random.choice(store) for x in range(10)],
                        'Sold': 1000 + 100 * np.random.rand(10)})
    df6 = pd.DataFrame({'Taste': [random.choice(taste) for x in range(10)],
                        'Store': [random.choice(store) for x in range(10)],
                        'Sold': 1000 + 100 * np.random.rand(10)})

    df1['Year'] = '2010'
    df1['City'] = 'MIA'
    df2['Year'] = '2011'
    df2['City'] = 'MIA'
    df3['Year'] = '2012'
    df3['City'] = 'MIA'
    df4['Year'] = '2010'
    df4['City'] = 'ATL'
    df5['Year'] = '2011'
    df5['City'] = 'ATL'
    df6['Year'] = '2012'
    df6['City'] = 'ATL'

    DF = pd.concat([df1, df2, df3, df4, df5, df6])
    DFG = DF.groupby(['Taste', 'Year', 'City'])
    # Total 'Sold' per taste, with (City, Year) as column levels.
    # The original call, .sum(axis=1, level=['City', 'Year']), is no longer
    # accepted by recent pandas versions, so the columns are built directly.
    DFGSum = DFG['Sold'].sum().unstack(['City', 'Year']).sort_index(axis=1)
    print(DFGSum)

    '''
    In my plot, I want the color of the bars to be determined by the "City".
    For example: All "ATL" bar colors will be the same regardless of the year.
    '''
    DFGSum.plot(kind='bar')
    plt.show()


if __name__ == '__main__':
    main()
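Best answer
Edited to include color cycling and an arbitrary number of cities.
You will need to specify some extra parameters to make it look good, but something like this might work: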
import itertools  # for color cycling
import numpy as np

# specify the colors you want for each city
# ('axes.color_cycle' was removed from matplotlib; 'axes.prop_cycle' replaces it)
color_cycle = itertools.cycle(plt.rcParams['axes.prop_cycle'].by_key()['color'])
colors = {cty: next(color_cycle) for cty in DF.City.unique()}

# specify the relative position of each bar
n = len(list(DFGSum))
positions = np.linspace(-n/2., n/2., n)

# plot each column individually
for i, col in enumerate(list(DFGSum)):
    c = colors[col[0]]  # col is a (City, Year) tuple
    pos = positions[i]
    DFGSum[col].plot(kind='bar', color=c,
                     position=pos, width=0.05)

plt.legend()
plt.show()
Although here you cannot tell which bar corresponds to which year...
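If the goal is only one color per city, a shorter route may be to pass a list of colors (one per column) to a single `DataFrame.plot` call, and to mark the year with a hatch pattern so the bars stay distinguishable. The sketch below is not part of the original answer. It uses a synthetic frame named `demo` in place of `DFGSum`, and the names `city_color` and `year_hatch`, along with the particular hatch patterns, are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for DFGSum: rows are tastes, columns are (City, Year).
rng = np.random.default_rng(0)
columns = pd.MultiIndex.from_product(
    [["ATL", "MIA"], ["2010", "2011", "2012"]], names=["City", "Year"]
)
demo = pd.DataFrame(
    rng.uniform(1000, 5000, size=(4, 6)),
    index=pd.Index(["Bitter", "Sour", "Spicy", "Sweet"], name="Taste"),
    columns=columns,
)

# One color per city, taken from the active style's color cycle.
palette = plt.rcParams["axes.prop_cycle"].by_key()["color"]
cities = demo.columns.get_level_values("City").unique()
city_color = {city: palette[i % len(palette)] for i, city in enumerate(cities)}

# One hatch pattern per year (an arbitrary choice made for this sketch).
years = demo.columns.get_level_values("Year").unique()
hatches = ["", "//", ".."]
year_hatch = {year: hatches[i % len(hatches)] for i, year in enumerate(years)}

# DataFrame.plot accepts a list of colors, one entry per column.
bar_colors = [city_color[city] for city, _ in demo.columns]
ax = demo.plot(kind="bar", color=bar_colors, edgecolor="black", legend=False)

# ax.containers holds one BarContainer per column, in column order.
for container, (city, year) in zip(ax.containers, demo.columns):
    for patch in container.patches:
        patch.set_hatch(year_hatch[year])
    container.set_label(f"{city} {year}")

ax.legend()
# Call plt.show() or ax.figure.savefig(...) to view the result.
```

Because every bar is drawn by one `plot` call, pandas handles the bar positions and widths itself, so no manual `position` arithmetic is needed.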
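Alternative
You can also make a slightly different kind of plot that keeps the year information in the tick labels. This generalizes to an arbitrary number of cities and keeps the default color style: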
from functools import reduce  # reduce is not a builtin in Python 3

# one row per (Taste, Year, City) with the summed 'Sold' value
df = DFG['Sold'].sum().reset_index().set_index(['Taste', 'Year'])
u_cty = df.City.unique()  # array(['ATL', 'MIA'], dtype=object)
df_list = []
for cty in u_cty:
    d = df.loc[df.City == cty]
    # rename the 'Sold' column to the city name
    d = d[['Sold']].rename(columns={'Sold': cty}).reset_index()
    df_list.append(d)

# merge the per-city dataframes on Taste and Year
df_merged = reduce(lambda left, right: pd.merge(left, right, on=['Taste', 'Year'], how='outer'), df_list)
df_merged.set_index(['Taste', 'Year'], inplace=True)
                     ATL          MIA
Taste  Year
Bitter 2010  3211.239754  2070.907629
       2011  2158.068222  2145.373251
       2012  2138.624730  1062.306874
Sour   2010  4188.024600          NaN
       2011  4323.003409          NaN
       2012  1042.772615  2136.742869
Spicy  2010  1018.737977  3155.450265
       2012  4171.954201  2096.569762
Sweet  2010  2098.679545  5324.078957
       2011  4215.376670  2115.964824
       2012  3152.998667  5277.410536
Spicy  2011          NaN  6295.032147
df_merged.plot(kind='bar')
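A frame of the same shape (rows indexed by Taste and Year, one column per city) can probably also be built in a single step with `pivot_table`, instead of the loop plus `reduce`/`merge`. The sketch below is not from the original answer. It assumes long-format data with `Taste`, `Year`, `City`, and `Sold` columns, here generated synthetically under the hypothetical name `records`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import numpy as np
import pandas as pd

# Synthetic long-format data standing in for DF (one row per sales record).
rng = np.random.default_rng(1)
records = pd.DataFrame({
    "Taste": rng.choice(["Bitter", "Sour", "Spicy", "Sweet"], size=60),
    "Year": rng.choice(["2010", "2011", "2012"], size=60),
    "City": rng.choice(["ATL", "MIA"], size=60),
    "Sold": 1000 + 100 * rng.random(60),
})

# Rows: (Taste, Year); columns: one per city; cells: total Sold.
# Combinations that have no records for a city come out as NaN.
by_city = records.pivot_table(
    index=["Taste", "Year"], columns="City", values="Sold", aggfunc="sum"
)

# One bar group per (Taste, Year) tick, one default color per city.
ax = by_city.plot(kind="bar")
```

Unlike the merged frame printed above, the rows here come out sorted by Taste and then Year, so a combination that is missing for one city stays next to its neighbors instead of being appended at the end.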
Regarding python - plotting a multi-index DataFrame bar chart where the color is determined by a category, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31949769/