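I have a dataframe on which I am trying to do feature selection. It has 45 columns of types int, float, and object.
However, I cannot fit any feature selection model because it throws an error. Please help.
Dataframe: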
member_id loan_amnt funded_amnt funded_amnt_inv term batch_enrolled int_rate grade
58189336 14350 14350 14350 36 months 19.19 E
70011223 4800 4800 4800 36 months BAT1586599 10.99 B
sub_grade emp_title emp_length home_ownership annual_inc verification_status pymnt_plan desc purpose title zip_code addr_state dti
E3 clerk 9 years OWN 28700 Source Verified n debt_consolidation Debt consolidation 349xx FL 33.88
B4 HR < 1 year MORTGAGE 65000 Source Verified n home_improvement Home improvement 209xx MD 3.64
last_week_pay loan_status
44th week 0
9th week 1
Code:
import numpy
import pandas as pd
from sklearn.decomposition import PCA
# load data
df = pd.read_csv("C:/Users/anagha/Documents/Python Scripts/train_indessa.csv")
array = df.values
X = array[:,0:44]
Y = array[:,44]
# feature extraction
pca = PCA(n_components=3)
fit = pca.fit(X)
Error:
Traceback (most recent call last):
File "<ipython-input-8-20f3863fd66e>", line 2, in <module>
fit = pca.fit(X)
File "C:\Users\anagha\Anaconda3\lib\site-packages\sklearn\decomposition\pca.py", line 301, in fit
self._fit(X)
File "C:\Users\anagha\Anaconda3\lib\site-packages\sklearn\decomposition\pca.py", line 333, in _fit
copy=self.copy)
File "C:\Users\anagha\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 382, in check_array
array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: could not convert string to float: '44th week'
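Best answer
A string such as '44th week' cannot be converted to a float.
The only part of that string Python can actually convert is the 44. To get there, I suggest changing the strings so that only the digits remain. After that, sklearn's fit should accept the data. The following code shows how to get an np array ready for conversion to float.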
import numpy as np
import pandas as pd
data = np.array([['rows','col1','Col2','Col_withtext'],
['Row1',1,2,'44th week'],
['Row2',3,4,'the 30th week']])
df = pd.DataFrame(data=data[1:,1:],
index=data[1:,0],
columns=data[0,1:])
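Use pandas replace to strip the letters (assigning the result back avoids relying on inplace=True on a selected column, which may not update df in recent pandas versions)
df['Col_withtext'] = df['Col_withtext'].replace(to_replace="[a-zA-Z]", value='',
                                                regex=True)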
df.values
## prints
array([['1', '2', '44 '],
['3', '4', ' 30 ']], dtype=object)
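Note that the values above are still strings (dtype=object, with leftover spaces), so one more step is needed before sklearn will accept them. A minimal sketch, assuming the same toy column as above; using `str.extract` and `pd.to_numeric` here is my substitution for the replace call, not something from the original answer, and the sample frame is hypothetical:

```python
import pandas as pd

# Hypothetical sample mirroring the toy frame from the answer
df = pd.DataFrame(
    {
        "col1": ["1", "3"],
        "Col2": ["2", "4"],
        "Col_withtext": ["44th week", "the 30th week"],
    },
    index=["Row1", "Row2"],
)

# Keep only the first run of digits; rows without any digits become NaN
df["Col_withtext"] = df["Col_withtext"].str.extract(r"(\d+)", expand=False)

# Convert every column to numbers; errors="coerce" turns unparseable values into NaN
X = df.apply(pd.to_numeric, errors="coerce").to_numpy(dtype=float)
print(X)
```

On the real dataframe the same idea would apply to `last_week_pay`; other object columns such as `grade` or `home_ownership` contain no digits, so they would need to be encoded or dropped rather than parsed this way.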
Let me know how it goes!
Source: "python-3.x - ValueError: could not convert string to float, Python" on Stack Overflow: https://stackoverflow.com/questions/42920168/