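I would like a function in Matplotlib similar to MATLAB's 'scatterhist' function: one that takes continuous values for the 'x' and 'y' axes plus a categorical variable as input, and produces a scatter plot with marginal KDE plots in which two or more categories are shown in different colours.
I have found examples of scatter plots with marginal histograms in Matplotlib, marginal histograms in Seaborn jointplot, overlapping histograms in Matplotlib, and marginal KDE plots in Matplotlib; but I have not found any example that combines a scatter plot with marginal KDE plots and colour-codes them to indicate different categories.
If possible, I would like a solution that uses "vanilla" Matplotlib without Seaborn, as this avoids the extra dependency and allows full control and customisation of the plot's appearance using standard Matplotlib commands.
I was going to try to write something based on the examples above; but before doing so I wanted to check whether a similar function is already available and, if not, I would be grateful for any guidance on the best approach to use.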
Best answer
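@ImportanceOfBeingEarnest: Many thanks for your help.
Here is my first attempt at a solution.
It is a bit hacky but it achieves my objective, and it is fully customisable using standard matplotlib commands. I am posting the code here with comments in case anyone else wishes to use it or develop it further. If there are any improvements or neater ways of writing the code, I am always keen to learn and would be grateful for guidance.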
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec
from scipy import stats
label = ['Setosa','Versicolor','Virginica'] # List of labels for categories
cl = ['b','r','y'] # List of colours for categories
categories = len(label)
sample_size = 20 # Number of samples in each category
# Create numpy arrays for dummy x and y data:
x = np.zeros(shape=(categories, sample_size))
y = np.zeros(shape=(categories, sample_size))
# Generate random data for each categorical variable:
for n in range(categories):
    x[n, :] = np.random.randn(sample_size) + 4 + n
    y[n, :] = np.random.randn(sample_size) + 6 - n
# Set up 4 subplots as axis objects using GridSpec:
gs = gridspec.GridSpec(2, 2, width_ratios=[1,3], height_ratios=[3,1])
# Add space between scatter plot and KDE plots to accommodate axis labels:
gs.update(hspace=0.3, wspace=0.3)
# Set background canvas colour to White instead of grey default
fig = plt.figure()
fig.patch.set_facecolor('white')
ax = plt.subplot(gs[0,1]) # Instantiate scatter plot area and axis range
ax.set_xlim(x.min(), x.max())
ax.set_ylim(y.min(), y.max())
ax.set_xlabel('x')
ax.set_ylabel('y')
axl = plt.subplot(gs[0,0], sharey=ax) # Instantiate left KDE plot area
axl.get_xaxis().set_visible(False) # Hide tick marks and spines
axl.get_yaxis().set_visible(False)
axl.spines["right"].set_visible(False)
axl.spines["top"].set_visible(False)
axl.spines["bottom"].set_visible(False)
axb = plt.subplot(gs[1,1], sharex=ax) # Instantiate bottom KDE plot area
axb.get_xaxis().set_visible(False) # Hide tick marks and spines
axb.get_yaxis().set_visible(False)
axb.spines["right"].set_visible(False)
axb.spines["top"].set_visible(False)
axb.spines["left"].set_visible(False)
axc = plt.subplot(gs[1,0]) # Instantiate legend plot area
axc.axis('off') # Hide tick marks and spines
# Plot data for each categorical variable as scatter and marginal KDE plots:
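for n in range(categories):
    # Hollow markers: no face colour, coloured edge
    ax.scatter(x[n], y[n], facecolors='none', edgecolors=cl[n], label=label[n], s=100)
    # Marginal KDE of x along the bottom axis
    kde = stats.gaussian_kde(x[n, :])
    xx = np.linspace(x.min(), x.max(), 1000)
    axb.plot(xx, kde(xx), color=cl[n])
    # Marginal KDE of y along the left axis (density on the horizontal axis)
    kde = stats.gaussian_kde(y[n, :])
    yy = np.linspace(y.min(), y.max(), 1000)
    axl.plot(kde(yy), yy, color=cl[n])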
# Copy legend object from scatter plot to lower left subplot and display:
# NB 'scatterpoints = 1' customises legend box to show only 1 handle (icon) per label
handles, labels = ax.get_legend_handles_labels()
axc.legend(handles, labels, scatterpoints = 1, loc = 'center', fontsize = 12)
plt.show()
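The question asks for a function that takes 'x', 'y' and a categorical variable, whereas the script above works on a pre-shaped 2D array. A minimal sketch of how the same approach might be wrapped into a reusable function is given below. The name `scatter_kde` and its parameters (`groups`, `colors`, `n_grid`) are hypothetical and not part of Matplotlib; the sketch also assumes each category has at least two distinct points, since `stats.gaussian_kde` cannot fit fewer.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def scatter_kde(x, y, groups, colors=None, n_grid=200):
    """Scatter plot with per-category marginal KDE curves.

    Hypothetical helper (not a Matplotlib API). `groups` holds one
    category label per (x, y) point.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    # Unique labels in first-seen order
    levels = list(dict.fromkeys(groups.tolist()))
    if colors is None:
        # Fall back to the default colour cycle (C0..C9)
        colors = [f"C{i % 10}" for i in range(len(levels))]

    fig = plt.figure()
    gs = fig.add_gridspec(2, 2, width_ratios=[1, 3], height_ratios=[3, 1],
                          hspace=0.3, wspace=0.3)
    ax = fig.add_subplot(gs[0, 1])               # main scatter plot
    axl = fig.add_subplot(gs[0, 0], sharey=ax)   # left KDE (of y)
    axb = fig.add_subplot(gs[1, 1], sharex=ax)   # bottom KDE (of x)
    axc = fig.add_subplot(gs[1, 0])              # legend area
    # Simplification: hide all ticks and spines on the helper axes
    for a in (axl, axb, axc):
        a.axis("off")

    # Shared evaluation grids spanning the full data range
    xx = np.linspace(x.min(), x.max(), n_grid)
    yy = np.linspace(y.min(), y.max(), n_grid)
    for level, color in zip(levels, colors):
        mask = groups == level
        ax.scatter(x[mask], y[mask], facecolors="none", edgecolors=color,
                   s=100, label=str(level))
        axb.plot(xx, stats.gaussian_kde(x[mask])(xx), color=color)
        axl.plot(stats.gaussian_kde(y[mask])(yy), yy, color=color)

    ax.set_xlabel("x")
    ax.set_ylabel("y")
    handles, labels = ax.get_legend_handles_labels()
    axc.legend(handles, labels, scatterpoints=1, loc="center")
    return fig, (ax, axl, axb, axc)


# Dummy data: 3 categories of 20 points each, shifted as in the script above
rng = np.random.default_rng(0)
labels = np.repeat(["Setosa", "Versicolor", "Virginica"], 20)
shift = np.repeat([0, 1, 2], 20)
x = rng.standard_normal(60) + 4 + shift
y = rng.standard_normal(60) + 6 - shift
fig, axes = scatter_kde(x, y, labels, colors=["b", "r", "y"])
```

Calling `plt.show()` or `fig.savefig(...)` afterwards displays or saves the figure. Returning the axes lets the caller keep customising the plot with ordinary Matplotlib commands, which was the stated reason for avoiding Seaborn.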
Regarding matplotlib - scatter plot with marginal KDE plots and multiple categories in Matplotlib, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57267150/