python - Plot all dictionary points on a scatter plot in Python

I have a dictionary of clusters, where each cluster contains a number of labels.

The dictionary looks like this:
demo_dict = {0: [b'3.0',b'3.0', b'3.0', b'5.0',b'5.0',b'5.0', b'6.0', b'6.0'],
 1: [b'2.0', b'2.0', b'3.0', b'7.0',b'7.0'],
 2: [b'1.0', b'4.0', b'8.0', b'7.0',b'7.0']}

To draw the plot I want, I use the following code:

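import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

comp = demo_dict
# one row per cluster, one column per position in that cluster's label list
df = pd.DataFrame.from_dict(comp, orient='index')
df.index.rename('Clusters', inplace=True)

# long format: one row per (cluster, position) pair with its label
stacked = df.stack().reset_index()
stacked.rename(columns={'level_1': 'Lable', 0: 'Labels'}, inplace=True)

sns.scatterplot(data=stacked, x='Clusters', y='Labels')
plt.show()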

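The problem is that the code above does not draw all the points. It only shows which clusters contain which labels, but I want to see every point of every cluster.

Is something missing in this code that would produce all the points? Note: I have also tried stripplot and swarmplot.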

Best answer

With groupby you can group by two columns. The counts can then be displayed as a heatmap:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

demo_dict = {}
for i in range(40):
    demo_dict[i] = np.random.choice([b'1.0', b'2.0', b'3.0', b'4.0', b'5.0', b'6.0', b'7.0', b'8.0'],
                                    np.random.randint(10, 30))
df = pd.DataFrame.from_dict(demo_dict, orient='index')
df.index.rename('Clusters', inplace=True)

stacked = df.stack().reset_index()
stacked.rename(columns={'level_1': 'Lable', 0: 'Labels'}, inplace=True)

grouped = stacked.groupby(['Labels', 'Clusters']).agg('count').unstack()

fig = plt.figure(figsize=(15, 4))
ax = sns.heatmap(data=grouped, annot=True, cmap='rocket_r', cbar_kws={'pad': 0.01})
ax.set_xlabel('')
ax.tick_params(axis='y', labelrotation=0)
plt.tight_layout()
plt.show()
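
To see what the grouped counts represent, the tally can be checked by hand on the small dictionary from the question. The following is only a minimal sketch using the standard library's `collections.Counter` instead of pandas; the names `pair_counts`, `total_points` and `distinct_positions` are made up for this illustration:

```python
from collections import Counter

# The small dictionary from the question: cluster id -> list of byte-string labels
demo_dict = {
    0: [b'3.0', b'3.0', b'3.0', b'5.0', b'5.0', b'5.0', b'6.0', b'6.0'],
    1: [b'2.0', b'2.0', b'3.0', b'7.0', b'7.0'],
    2: [b'1.0', b'4.0', b'8.0', b'7.0', b'7.0'],
}

# Count how often each (cluster, label) pair occurs
pair_counts = Counter(
    (cluster, label)
    for cluster, labels in demo_dict.items()
    for label in labels
)

total_points = sum(pair_counts.values())  # every point in the dictionary
distinct_positions = len(pair_counts)     # distinct (x, y) positions on the plot

# Print one line per pair: cluster, label, number of occurrences
for (cluster, label), count in sorted(pair_counts.items()):
    print(cluster, label.decode(), count)
```

Repeated (cluster, label) pairs share the same coordinates, so a plain scatter plot draws them on top of each other. That is why fewer dots are visible than there are entries in the dictionary.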

Another approach is to show the counts as marker sizes in a scatter plot:

grouped = stacked.groupby(['Labels', 'Clusters']).agg('count').reset_index()
fig = plt.figure(figsize=(15, 4))
ax = sns.scatterplot(data=grouped, x='Clusters', y='Labels', size='Lable', color='orchid')
legend = ax.get_legend()
# matplotlib 3.7 renamed `legendHandles` to `legend_handles`; support both
handles = legend.legend_handles if hasattr(legend, 'legend_handles') else legend.legendHandles
for h in handles:
    h.set_color('orchid')  # the default color in the size legend is black
ax.margins(x=0.01)  # less whitespace
# place the legend outside the axes
ax.legend(handles=handles, title='Counts:', bbox_to_anchor=(1.01, 1.02), loc='upper left')
plt.tight_layout()
plt.show()

You can also try the approach from "How to make jitterplot on matplolib python", optionally using different jitter offsets in the x and y directions. With your data, it could look like this:

def jitter_dots(dots):
    # dots is the collection of markers created by the scatter plot
    offsets = dots.get_offsets()
    jittered_offsets = offsets.copy()  # work on a copy instead of an alias
    jittered_offsets[:, 0] += np.random.uniform(-0.3, 0.3, offsets.shape[0])  # x
    jittered_offsets[:, 1] += np.random.uniform(-0.3, 0.3, offsets.shape[0])  # y
    dots.set_offsets(jittered_offsets)

ax = sns.scatterplot(data=stacked, x='Clusters', y='Labels')
jitter_dots(ax.collections[0])
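
The jitter itself can be looked at without any plotting. The sketch below is a stand-alone illustration using the standard library's `random` module rather than matplotlib; `jitter_points`, `amount` and `seed` are hypothetical names introduced here and are not part of the answer's code. It shifts each point by a uniform random offset, so coincident points end up at slightly different positions, and a fixed seed makes the result reproducible:

```python
import random


def jitter_points(points, amount=0.3, seed=None):
    """Return a new list of (x, y) tuples, each coordinate shifted by a
    uniform random offset in [-amount, amount]. The input is not modified."""
    rng = random.Random(seed)  # private generator, so the seed stays local
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]


# Several coincident points, as produced by repeated labels within a cluster
points = [(0, 3.0), (0, 3.0), (0, 3.0), (1, 7.0), (1, 7.0)]
jittered = jitter_points(points, amount=0.3, seed=42)

for original, moved in zip(points, jittered):
    print(original, '->', moved)
```

An offset of at most 0.3 keeps each dot closer to its own cluster and label position than to the neighbouring ones, assuming those are spaced one unit apart.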

This is how it looks with 8 different colors, alternating between clusters:

ax = sns.scatterplot(data=stacked, x='Clusters', y='Labels',
                     hue=stacked['Clusters'] % 8, palette='Dark2', legend=False)
jitter_dots(ax.collections[0])
ax.margins(x=0.02)
sns.despine()

Regarding "python - Plot all dictionary points on a scatter plot in Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/66838712/
