python - Plotting K-means clusters after TruncatedSVD in Python

Tags: python matplotlib scikit-learn k-means svd

I am trying to plot the results of running clustering on my dataset, but I get an error:

  File "cluster.py", line 93, in <module>
    Z = kmeans.predict(np.c_[xx.ravel(), yy.ravel()])
  File "/usr/local/lib/python2.7/dist-packages/sklearn/cluster/k_means_.py", line 957, in predict
    X = self._check_test_data(X)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/cluster/k_means_.py", line 867, in _check_test_data
    n_features, expected_n_features))
ValueError: Incorrect number of features. Got 2 features, expected 73122

My call to fit() works fine; the error occurs in the plotting step.

Here is my code:

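import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD

reduced_data = TruncatedSVD(n_components=2).fit_transform(X)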

kmeans = KMeans(n_clusters=4, init='k-means++', max_iter=100, n_init=1, verbose=False)
kmeans.fit(X)

h = .02     # step size of the mesh [x_min, x_max]x[y_min, y_max].

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh.
x_min, x_max = reduced_data[:, 0].min() - 1, reduced_data[:, 0].max() + 1
y_min, y_max = reduced_data[:, 1].min() - 1, reduced_data[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

# Obtain labels for each point in mesh. Use last trained model.
Z = kmeans.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.figure(1)
plt.clf()
plt.imshow(Z, interpolation='nearest',
           extent=(xx.min(), xx.max(), yy.min(), yy.max()),
           cmap=plt.cm.Paired,
           aspect='auto', origin='lower')

plt.plot(reduced_data[:, 0], reduced_data[:, 1], 'k.', markersize=2)
# Plot the centroids as a white X
centroids = kmeans.cluster_centers_
plt.scatter(centroids[:, 0], centroids[:, 1],
            marker='x', s=169, linewidths=3,
            color='w', zorder=10)
plt.title('K-means clustering on the digits dataset (PCA-reduced data)\n'
          'Centroids are marked with white cross')
plt.xlim(x_min, x_max)
plt.ylim(y_min, y_max)
plt.xticks(())
plt.yticks(())
plt.show()

Can anyone suggest how to change my code to get the cluster plot?

Best answer

The traceback tells you what the problem is:

ValueError: Incorrect number of features. Got 2 features, expected 73122

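The kmeans estimator was fitted on 73122-dimensional training samples, so you cannot use kmeans to make predictions on 2-dimensional test samples. The mesh points passed to predict() come from reduced_data, which has only 2 features per sample.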

To fix your code, simply change kmeans.fit(X) to kmeans.fit(reduced_data).
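As a minimal sketch of that change, the snippet below fits KMeans on the 2-D projection and then predicts on the mesh, so both calls see 2 features. The sparse random matrix X is a hypothetical stand-in for the asker's data (the real X has 73122 features), and the random_state arguments are added only to make the run repeatable.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD

# Hypothetical stand-in for the real data: a sparse 200 x 500 matrix.
X = sparse_random(200, 500, density=0.05, format="csr", random_state=0)

# Project to 2 dimensions, then cluster in that same 2-D space.
reduced_data = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)
kmeans = KMeans(n_clusters=4, init="k-means++", max_iter=100, n_init=1,
                random_state=0)
kmeans.fit(reduced_data)  # was: kmeans.fit(X)

# Build the mesh over the 2-D projection, exactly as in the question.
h = 0.02
x_min, x_max = reduced_data[:, 0].min() - 1, reduced_data[:, 0].max() + 1
y_min, y_max = reduced_data[:, 1].min() - 1, reduced_data[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

# The mesh points have 2 features, matching what kmeans was fitted on.
Z = kmeans.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
```

With this change kmeans.cluster_centers_ is also 2-dimensional, so the plt.imshow and plt.scatter calls from the question can be used unchanged. Note that the clusters are then computed in the reduced space, which may differ from clustering the original high-dimensional data.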

Regarding python - Plotting K-means clusters after TruncatedSVD in Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42882207/
