python - Adding a 'rest' group with Pandas value_counts()

Tags: python pandas matplotlib

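I have just started using the pandas library to analyze large data sets. I am analyzing credit card data with an attribute issuercountrycode, which takes 117 possible values. To visualize which issuercountrycode values occur in the data set, I currently generate a pie chart with the following code.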

import matplotlib.pyplot as plt
df['issuercountrycode'].value_counts().plot(kind='pie')
plt.show()

This produces the following pie chart (image not preserved):

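As you can see, this is far from ideal, because many of the values occur only rarely. Is it possible in pandas to apply a threshold when using value_counts(), so that values whose count falls below it are combined into a 'rest' group? Are operations of this kind possible in pandas?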

Best answer

You need boolean indexing to keep the counts above the threshold, and sum to total the remaining ones:

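threshold = 2
a = df['issuercountrycode'].value_counts()
# keep only the values that occur more than `threshold` times
b = a[a > threshold]
# fold all remaining counts into a single 'rest' entry
b['rest'] = a[a <= threshold].sum()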

Example:

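import numpy as np
import pandas as pd

# reproducible sample data: 15 random draws from six codes
np.random.seed(10)
L = list('abcdef')
df = pd.DataFrame({'issuercountrycode': np.random.choice(L, size=15)})

threshold = 2
a = df['issuercountrycode'].value_counts()
b = a[a > threshold]
b['rest'] = a[a <= threshold].sum()
# output below is from pandas < 2.0; pandas 2.x names the result 'count'
print(b)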
b       5
f       3
a       3
rest    4
Name: issuercountrycode, dtype: int64

b.plot.pie()
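
The same idea can be wrapped in a small reusable function. The following is only a sketch: the name value_counts_with_rest and its parameters are made up for illustration and are not part of pandas. It builds the 'rest' entry with pd.concat instead of assigning into the filtered Series, and skips the extra entry when nothing falls at or below the threshold.

```python
import pandas as pd


def value_counts_with_rest(series, threshold, rest_label="rest"):
    """Count values, folding those seen at most `threshold` times into one group.

    Hypothetical helper for illustration; not a pandas function.
    """
    counts = series.value_counts()
    frequent = counts[counts > threshold]
    rest_total = counts[counts <= threshold].sum()
    if rest_total == 0:
        # nothing to fold, so return the frequent values unchanged
        return frequent
    rest = pd.Series([rest_total], index=[rest_label], name=counts.name)
    return pd.concat([frequent, rest])


# small hand-made sample: a x4, b x3, d x2, c x1, e x1
codes = pd.Series(list("aaaabbbcdde"), name="issuercountrycode")
grouped = value_counts_with_rest(codes, threshold=2)
print(grouped)
```

Calling grouped.plot.pie() on the result should then draw one slice per frequent value plus a single slice for the combined group, just as with b above.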

Source: a similar question on Stack Overflow about adding a 'rest' group with Pandas value_counts(): https://stackoverflow.com/questions/43950837/
