python - Matplotlib pie chart labels do not match the values

Tags: python pandas matplotlib bar-chart pie-chart

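I am working with this Kaggle dataset: https://www.kaggle.com/edqian/twitter-climate-change-sentiment-dataset
I have converted the sentiment from numbers to text labels (0 becomes Neutral, 1 becomes Pro, and -1 becomes Anti).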

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

tweets_df = pd.read_csv('twitter_sentiment_data.csv')

tweets_df.loc[tweets_df['sentiment'] == 0, 'twt_sentiment'] = 'Neutral'
tweets_df.loc[tweets_df['sentiment'] == -1, 'twt_sentiment'] = 'Anti'
tweets_df.loc[tweets_df['sentiment'] == 1, 'twt_sentiment'] = 'Pro'

tweets_df = tweets_df.drop(['sentiment'], axis=1) 

# display(tweets_df.head())
                                                                                                                                              message             tweetid twt_sentiment
0           @tiniebeany climate change is an interesting hustle as it was global warming but the planet stopped warming for 15 yes while the suv boom  792927353886371840          Anti
1  RT @NatGeoChannel: Watch #BeforeTheFlood right here, as @LeoDiCaprio travels the world to tackle climate change https://toco/LkDehj3tNn htt…  793124211518832641           Pro
2                               Fabulous! Leonardo #DiCaprio's film on #climate change is brilliant!!! Do watch. https://toco/7rV6BrmxjW via @youtube  793124402388832256           Pro
3     RT @Mick_Fanning: Just watched this amazing documentary by leonardodicaprio on climate change. We all think this… https://toco/kNSTE8K8im  793124635873275904           Pro
4         RT @cnalive: Pranita Biswasi, a Lutheran from Odisha, gives testimony on effects of climate change & natural disasters on the po…  793125156185137153           NaN
I want to create a figure with subplots that shows the sentiment both as counts and as percentages. The code I tried:
sns.set(font_scale=1.5)
plt.style.use("seaborn-poster")  # named "seaborn-v0_8-poster" in matplotlib >= 3.6

fig, axes = plt.subplots(1, 2, figsize=(20, 10), dpi=100)

sns.countplot(x=tweets_df["twt_sentiment"], ax=axes[0])
labels = list(tweets_df["twt_sentiment"].unique())

axes[1].pie(tweets_df["twt_sentiment"].value_counts(),
            autopct="%1.0f%%",
            labels=labels,
            startangle=90,
            explode=tuple([0.1] * len(labels)))

fig.suptitle("Distribution of Tweets", fontsize=20)
plt.show()
The result is not what I want, because the pie chart labels are wrong.
[Image: pie chart with wrong labelling]
After using sort=False in value_counts, the pie chart looks like this:
[Image: after sort=False]

Best answer

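  • labels = list(tweets_df["twt_sentiment"].unique()) does not put the labels in the same order as the index of tweets_df.twt_sentiment.value_counts(). unique() returns values in order of first appearance, while value_counts() sorts by descending count, and its index determines the slice order. It is therefore best to use the .value_counts() index as the labels.
  • Labels can easily be added to the bar plot, which makes the pie chart unnecessary.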

    import pandas as pd
    import matplotlib.pyplot as plt
    
    tweets_df = pd.read_csv('data/kaggle/twitter_climate_change_sentiment/twitter_sentiment_data.csv')
    
    tweets_df.loc[tweets_df['sentiment'] == -1, 'twt_sentiment'] = 'Anti'
    tweets_df.loc[tweets_df['sentiment'] == 1, 'twt_sentiment'] = 'Pro'
    tweets_df.loc[tweets_df['sentiment'] == 0, 'twt_sentiment'] = 'Neutral'
    
    # assign value_counts to a variable; this is a pandas.Series
    vc = tweets_df.twt_sentiment.value_counts()
    
    # assign the value_counts index as the labels
    labels = vc.index
    
    # custom colors
    colors = ['tab:blue', 'tab:orange', 'tab:green']
    
    fig, axes = plt.subplots(1, 2, figsize=(10, 5), dpi=100)
    
    # plot the pandas.Series directly with pandas.Series.plot
    p1 = vc.plot(kind='bar', ax=axes[0], color=colors, rot=0, xlabel='Tweet Sentiment', width=.75)
    
    # add count label
    axes[0].bar_label(p1.containers[0], label_type='center')
    
    # add percent labels
    blabels = [f'{(v / vc.sum())*100:0.0f}%' for v in vc]
    axes[0].bar_label(p1.containers[0], labels=blabels, label_type='edge')
    
    # make space at the top of the bar plot
    axes[0].margins(y=0.1)
    
    # add the pie plot
    axes[1].pie(vc, labels=labels, autopct="%1.0f%%", startangle=90, explode=tuple([0.1] * len(labels)), colors=colors)
    
    fig.suptitle("Distribution of Tweets", fontsize=20)
    plt.show()
    
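To see the ordering mismatch without downloading the dataset, here is a minimal sketch on a hypothetical toy Series (the values below are made up for illustration and are not taken from the Kaggle data). It assumes only pandas, and shows that pairing unique() labels with value_counts() values attaches the wrong count to each label, while taking the labels from the value_counts() index keeps them aligned.

```python
import pandas as pd

# Hypothetical toy data standing in for the twt_sentiment column
s = pd.Series(["Anti", "Pro", "Pro", "Pro", "Neutral", "Neutral"])

# unique() keeps the order of first appearance
print(list(s.unique()))   # ['Anti', 'Pro', 'Neutral']

# value_counts() sorts by descending count, so its index is ordered differently
vc = s.value_counts()
print(list(vc.index))     # ['Pro', 'Neutral', 'Anti']

# Labels from unique() zipped with counts from value_counts(): wrong pairing
mismatched = dict(zip(s.unique(), vc))
print(mismatched)         # 'Anti' is paired with the count that belongs to 'Pro'

# Labels and counts taken from the same Series: correct pairing
matched = dict(zip(vc.index, vc))
print(matched)            # 'Pro' -> 3, 'Neutral' -> 2, 'Anti' -> 1
```

If a fixed slice order is preferred over the count order, one option is to reindex the counts, for example vc.reindex(['Anti', 'Neutral', 'Pro']), and pass both the values and the labels from that reindexed Series so a positional color list always maps to the same sentiment.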

    This entry, python - Matplotlib pie chart labels do not match the values, is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/69024302/
