I have a dataframe like this:
| A | B | C | D |
|---|---|----|---|
| 1 | 3 | 10 | 4 |
| 2 | 3 | 1 | 5 |
| 1 | 7 | 9 | 3 |
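Here A, B, C, and D are categories, and the values are in the range [1, 10] (some values may not appear in a given column). I want a dataframe that shows, for each category, the count of each value. Something like this: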
| | A | B | C | D |
|----|---|----|---|---|
| 1 | 2 | 0 | 1 | 0 |
| 2 | 1 | 0 | 0 | 0 |
| 3 | 0 | 2 | 0 | 1 |
| 4 | 0 | 0 | 0 | 1 |
| 5 | 0 | 0 | 0 | 1 |
| 6 | 0 | 0 | 0 | 0 |
| 7 | 0 | 1 | 0 | 0 |
| 8 | 0 | 0 | 0 | 0 |
| 9 | 0 | 0 | 1 | 0 |
| 10 | 0 | 0 | 1 | 0 |
I tried using groupby and pivot_table, but I can't work out which arguments to pass.
Best answer
pandas.Series.value_counts, applied to each column, produces the counts.
seaborn.heatmap will then plot the resulting DataFrame.
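To make the per-column step concrete, here is a minimal sketch using the sample data from the question. It shows what value_counts returns for one column, and how applying it to every column aligns the results on the union of all observed values. The variable names `counts_a` and `counts` are only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'A': [1, 2, 1], 'B': [3, 3, 7], 'C': [10, 1, 9], 'D': [4, 5, 3]})

# Counts for a single column: the index holds the distinct values,
# the data holds how many times each one occurs
counts_a = df['A'].value_counts()

# Applying value_counts to every column aligns the results on the union
# of all values; a value missing from a column shows up as NaN
counts = df.apply(pd.Series.value_counts)
print(counts)
```

Values that appear in no column at all (6 and 8 here) do not get a row, because value_counts only reports values it has actually seen.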
Option 1
import seaborn as sns
import pandas as pd
# dataframe setup
data = {'A': [1, 2, 1], 'B': [3, 3, 7], 'C': [10, 1, 9], 'D': [4, 5, 3]}
df = pd.DataFrame(data)
# create a dataframe of the counts for each column
# (the top-level pd.value_counts is deprecated in recent pandas; use the Series method)
counts = df.apply(pd.Series.value_counts)
# display(counts)
A B C D
1 2.0 NaN 1.0 NaN
2 1.0 NaN NaN NaN
3 NaN 2.0 NaN 1.0
4 NaN NaN NaN 1.0
5 NaN NaN NaN 1.0
7 NaN 1.0 NaN NaN
9 NaN NaN 1.0 NaN
10 NaN NaN 1.0 NaN
# plot
sns.heatmap(counts)
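Option 2
Passing cmap changes the colors, which can improve the visualization.
Adding .fillna(0) replaces the NaN cells with 0, so the plot looks less busy.
# counts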
counts = df.apply(pd.Series.value_counts).fillna(0)
# plot
sns.heatmap(counts, cmap="GnBu", annot=True)
Default colors
sns.heatmap(counts, annot=True)
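The table requested in the question lists every value from 1 to 10, including values such as 6 and 8 that never occur, and uses integer counts. The following is a minimal sketch of one way to get there, assuming the value range really is [1, 10] as the question states. The name `full_range` is introduced here only for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'A': [1, 2, 1], 'B': [3, 3, 7], 'C': [10, 1, 9], 'D': [4, 5, 3]})

# Assumption: every possible value lies in [1, 10], as stated in the question
full_range = range(1, 11)

counts = (
    df.apply(pd.Series.value_counts)  # per-column counts, NaN where a value is absent
      .reindex(full_range)            # add rows for values that never occur
      .fillna(0)                      # treat "absent" as a count of zero
      .astype(int)                    # counts are whole numbers
)
print(counts)
```

The resulting frame can be passed to sns.heatmap(counts, annot=True) in the same way as above, so the heatmap also shows the empty rows.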
Source: the Stack Overflow question "python - Heatmap of counts of every value in every column": https://stackoverflow.com/questions/63757556/