I'm trying to convert a column with pd.to_numeric, but for some reason it converts every value except one to NaN:
In[]: pd.to_numeric(portfolio["Principal Remaining"],errors="coerce")
Out[]:
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
15 NaN
16 NaN
17 NaN
18 836.61
19 NaN
20 NaN
...
Name: Principal Remaining, Length: 32314, dtype: float64
Any idea why this happens? The original data looks like this:
1 18,052.02
2 27,759.85
3 54,061.75
4 89,363.61
5 46,954.46
6 64,295.64
7 100,000.00
8 27,905.98
9 13,821.48
10 16,937.89
...
Name: Principal Remaining, Length: 32314, dtype: object
Best answer
read_csv with thousands=','
df = pd.read_csv('file.csv', thousands=',')
This fixes the problem at the point where the data is read.
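As a minimal sketch of the first option, the snippet below reads a small in-memory CSV instead of 'file.csv'. The sample text and the use of io.StringIO are assumptions made for illustration; the column name and values are borrowed from the question. Fields containing commas have to be quoted in the file, otherwise the comma would be treated as a column separator.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for 'file.csv'.
# Values with a thousands separator are quoted so the comma
# is not mistaken for the field delimiter.
csv_text = 'Principal Remaining\n"18,052.02"\n"27,759.85"\n836.61\n'

# thousands=',' tells the parser to drop the commas while parsing numbers.
df = pd.read_csv(io.StringIO(csv_text), thousands=',')

print(df['Principal Remaining'].dtype)  # float64
print(df['Principal Remaining'])
```

With this option the column arrives as a numeric dtype, so no later call to pd.to_numeric should be needed.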
replace and to_numeric
df['Principal Remaining'] = pd.to_numeric(
df['Principal Remaining'].str.replace(',', ''), errors='coerce')
If the first option is not available, strip the commas with str.replace first and then call pd.to_numeric, as shown above.
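The reason for the NaN values seems to be that strings such as '18,052.02' are not valid numeric literals, so errors='coerce' turns them into NaN, while a value without a comma (836.61 in the question) parses fine. The sketch below reproduces this on a small hand-made Series. The sample values come from the question, the variable names are illustrative, and regex=False is an addition here so that the comma is always treated as a literal character.

```python
import pandas as pd

# A small object-dtype Series mimicking the column from the question.
raw = pd.Series(['18,052.02', '27,759.85', '836.61'],
                name='Principal Remaining')

# Strings containing a comma cannot be parsed, so they are coerced to NaN.
coerced = pd.to_numeric(raw, errors='coerce')

# Removing the commas first lets every value parse as a float.
cleaned = pd.to_numeric(raw.str.replace(',', '', regex=False),
                        errors='coerce')

print(coerced)  # NaN, NaN, 836.61
print(cleaned)
```

Keeping errors='coerce' in the second call means that any value that is still unparseable after the commas are removed would show up as NaN instead of raising an exception.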
For more on "python - pd.to_numeric converts entire series to NaN", see the corresponding question on Stack Overflow: https://stackoverflow.com/questions/48122696/