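I tried to use SimpleImputer with the mean strategy to impute NaN values, but instead of imputing them it removed them. I read about how to use it here and in the documentation, and it does not seem to work with either NumPy arrays or Python lists. What is going wrong, and how can I fix it?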
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
X = np.array([[2,3,6,5,4, np.nan],[2,3,6,15,4, np.nan]])
SI = SimpleImputer(strategy='mean')
X = SI.fit_transform(X)
print(X)
Output
runfile('D:/python projects/untitled0.py', wdir='D:/python projects')
[[ 2. 3. 6. 5. 4.]
[ 2. 3. 6. 15. 4.]]
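Best answer
The last column of X contains only NaN, so there is no observed value from which to compute a mean, and SimpleImputer drops that column. With verbose=1 it warns about this: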
In [239]: SI=SimpleImputer(verbose=1)
In [240]: SI.fit_transform(X)
/usr/local/lib/python3.6/dist-packages/sklearn/impute/_base.py:403: UserWarning: Deleting features without observed values: [5]
"observed values: %s" % missing)
Out[240]:
array([[ 2., 3., 6., 5., 4.],
[ 2., 3., 6., 15., 4.]])
Adjust X so that every column has at least one observed value:
In [241]: X = np.array([[2,3,6,5,4, np.nan],[2,3,6,15,np.nan, 4]])
In [242]: SI.fit_transform(X)
Out[242]:
array([[ 2., 3., 6., 5., 4., 4.],
[ 2., 3., 6., 15., 4., 4.]])
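The behaviour shown above can be checked before imputing. The following is a minimal sketch, not part of the original answer: it flags columns that are entirely NaN with plain NumPy and then confirms that SimpleImputer drops such a column but fills a partially observed one. The variable names (empty_cols, X_partial and so on) are illustrative assumptions.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Same data as in the question: the last column is NaN in every row.
X = np.array([[2, 3, 6, 5, 4, np.nan],
              [2, 3, 6, 15, 4, np.nan]])

# A column with no observed value has no mean to impute with.
empty_cols = np.isnan(X).all(axis=0)
print("all-NaN columns:", np.flatnonzero(empty_cols))

# With the default settings the all-NaN column is dropped (a warning is emitted).
X_imputed = SimpleImputer(strategy="mean").fit_transform(X)
print(X_imputed.shape)

# Once each column has at least one observed value, NaN is replaced
# by the column mean and the shape is preserved.
X_partial = np.array([[2, 3, 6, 5, 4, np.nan],
                      [2, 3, 6, 15, np.nan, 4]])
X_partial_imputed = SimpleImputer(strategy="mean").fit_transform(X_partial)
print(X_partial_imputed)
```

If the empty column has to be kept, recent scikit-learn releases (1.2 and later, as far as I know) accept keep_empty_features=True in SimpleImputer, which keeps such columns and fills them with 0 for the mean strategy; this is worth verifying against the documentation of the installed version. Note also that the verbose argument used in the answer appears to have been removed from newer releases.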
Regarding "python - SimpleImputer removes NaN instead of imputing", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60407152/