python - PCA projection centroids and ellipses

Tags: python pca projection

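I am currently working on my PhD, and I would like to know whether anyone who uses PCA projections has ideas for displaying more information on the plot, the way some R libraries print it by default. See the example in STHDA PCA Analysis.

What would be the best way to do this?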

Best answer

I was going to ask for hints, but I found some answers on my own that produce the same result in Python.

What I did:

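import numpy as np
from matplotlib import pyplot, transforms
from matplotlib.patches import Ellipse
from sklearn.decomposition import PCA


def confidence_ellipse(x, y, ax, n_std=3.0, facecolor='none', **kwargs):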
    """
    Create a plot of the covariance confidence ellipse of `x` and `y`

    Parameters
    ----------
    x, y : array_like, shape (n, )
        Input data.

    ax : matplotlib.axes.Axes
        The axes object to draw the ellipse into.

    n_std : float
        The number of standard deviations to determine the ellipse's radii.

    Returns
    -------
    matplotlib.patches.Ellipse

    Other parameters
    ----------------
    kwargs : `~matplotlib.patches.Patch` properties
    """
    if x.size != y.size:
        raise ValueError("x and y must be the same size")

    cov = np.cov(x, y)
    pearson = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
    # Using a special case to obtain the eigenvalues of this
    # two-dimensional dataset.
    ell_radius_x = np.sqrt(1 + pearson)
    ell_radius_y = np.sqrt(1 - pearson)
    ellipse = Ellipse((0, 0),
                      width=ell_radius_x * 2,
                      height=ell_radius_y * 2,
                      facecolor=facecolor,
                      **kwargs)

    # Calculating the standard deviation of x from
    # the square root of the variance and multiplying
    # with the given number of standard deviations.
    scale_x = np.sqrt(cov[0, 0]) * n_std
    mean_x = np.mean(x)

    # Calculating the standard deviation of y ...
    scale_y = np.sqrt(cov[1, 1]) * n_std
    mean_y = np.mean(y)

    transf = transforms.Affine2D() \
        .rotate_deg(45) \
        .scale(scale_x, scale_y) \
        .translate(mean_x, mean_y)

    ellipse.set_transform(transf + ax.transData)
    return ax.add_patch(ellipse)
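
The "special case" mentioned in the comments of confidence_ellipse can be checked numerically. For standardized data the covariance matrix is the correlation matrix [[1, p], [p, 1]], whose eigenvalues are 1 - p and 1 + p and whose principal axes lie on the diagonals. That is why the function uses the radii sqrt(1 + p) and sqrt(1 - p) and a fixed 45 degree rotation before rescaling. A minimal sketch, where the value of pearson is an arbitrary number chosen only for illustration:

```python
import numpy as np

# Hypothetical correlation coefficient, chosen only for illustration.
pearson = 0.6
corr = np.array([[1.0, pearson],
                 [pearson, 1.0]])

# eigh handles symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# The eigenvalues of a 2x2 correlation matrix are 1 - p and 1 + p.
assert np.allclose(eigenvalues, [1 - pearson, 1 + pearson])

# The eigenvector of the largest eigenvalue points along the 45 degree diagonal
# (its sign is arbitrary, so the angle is reduced modulo 180).
major_axis = eigenvectors[:, 1]
angle = np.degrees(np.arctan2(major_axis[1], major_axis[0])) % 180
assert np.isclose(angle, 45.0)

# These are the unit-ellipse radii used by confidence_ellipse before scaling.
ell_radius_x = np.sqrt(1 + pearson)
ell_radius_y = np.sqrt(1 - pearson)
```

The later scale(scale_x, scale_y) step stretches this standardized ellipse back to the data's actual standard deviations, multiplied by n_std.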


# inputs, tags, labels, settings and Views come from the author's own project;
# labels is assumed to be a NumPy array with one label per sample.
method = PCA(n_components=2, whiten=True)  # project to 2 dimensions
projected = method.fit_transform(np.array(inputs[tags['datum']].tolist()))

figure = pyplot.figure()
axis = figure.add_subplot(111)
# Display data, one scatter call per distinct label
for label in np.unique(labels):
    mask = labels == label
    color = np.expand_dims(np.array(settings.get_color(label)), axis=0)
    axis.scatter(projected[mask, 0], projected[mask, 1],
                 c=color, alpha=0.5, label=label, edgecolor='none')

# Centroids: each confidence ellipse is centred on the mean of its group
for label in np.unique(labels):
    mask = labels == label
    color = np.array(settings.get_color(label))
    Views.confidence_ellipse(projected[mask, 0], projected[mask, 1], axis,
                             edgecolor=color, linewidth=3, zorder=0)


The confidence_ellipse function comes from the matplotlib example gallery.
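
For readers without access to the project-specific objects used in the answer, here is a self-contained sketch on synthetic data. It is not the answer's exact code: the data, group names, and colors are made up for illustration, the centroid markers are an addition, and the ellipse is built from an eigendecomposition of each group's covariance matrix rather than from the Pearson-coefficient shortcut.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
from matplotlib import pyplot
from matplotlib.patches import Ellipse
from sklearn.decomposition import PCA

# Synthetic stand-in data: three groups of 50 samples with 5 features each.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(loc=shift, size=(50, 5)) for shift in (0.0, 3.0, 6.0)])
labels = np.repeat(["a", "b", "c"], 50)

projected = PCA(n_components=2, whiten=True).fit_transform(data)

figure, axis = pyplot.subplots()
centroids = {}
n_std = 2.0  # ellipse size in standard deviations (arbitrary choice)

for label, color in zip(np.unique(labels), ("tab:blue", "tab:orange", "tab:green")):
    points = projected[labels == label]
    axis.scatter(points[:, 0], points[:, 1], color=color, alpha=0.5,
                 edgecolor="none", label=label)

    # Centroid: the mean of the group's projected points.
    centroid = points.mean(axis=0)
    centroids[label] = centroid
    axis.scatter(centroid[0], centroid[1], color=color, marker="X", s=120,
                 edgecolor="black", zorder=3)

    # Ellipse axes from the eigendecomposition of the group covariance.
    cov = np.cov(points, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # ascending order
    major = eigenvectors[:, 1]
    angle = np.degrees(np.arctan2(major[1], major[0]))
    width, height = 2 * n_std * np.sqrt(eigenvalues[::-1])
    axis.add_patch(Ellipse(centroid, width=width, height=height, angle=angle,
                           edgecolor=color, facecolor="none",
                           linewidth=2, zorder=0))

axis.legend()
# Call figure.savefig(...) or pyplot.show() to view the result.
```

Drawing the centroid with a higher zorder than the ellipse keeps the marker visible on top of the outline and the scattered points.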

Regarding python - PCA projection centroids and ellipses, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/61897320/
