python - Pandas: How to find the row in a CSV that raises "ValueError: could not convert string to float"

I import a CSV into a Pandas dataframe with the following command:

df=pandas.read_csv("import.csv", names=["Year", "Month", "Day", "Time", 
"ColA"], encoding='iso-8859-1')

However, Pandas imports ColA with dtype object.

I tried to convert the column to float with:

df['ColA'] = df['ColA'].astype(float)

But this raises the following error:

 ValueError: could not convert string to float:

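This limits me, because I can't run Pandas functions such as mean or sum on an object-typed column (and I need to). Running such a function on an object-typed column of the dataframe returns this error: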

DataError: No numeric types to aggregate

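ColA also contains negative numbers. Now I'd like to know how to get Spyder/Python/Pandas to tell me which row specifically raises the error. In other words, how do I find out which row contains something that Python interprets as a string?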

The CSV contains several hundred thousand rows, so looking for the string just by browsing the CSV in Excel is hopeless. Any advice is much appreciated!

Edit

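The comment below from @Jon Clements successfully converted the column to float64. However, to deal with these "non-numeric faulty values", locating them would be much easier if the Spyder/Python IDE console could report them. It would make sense for Python to return the exact position that raised the error. It would also save a lot of time hunting through these files, especially with huge CSV files.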

Version info:

python: 3.6.3.final.0

python-bits: 64

pandas: 0.20.3

Best answer

Have you tried df['ColA'].astype('float64')?

If that doesn't work, try:

df.apply(pd.to_numeric)
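As a rough illustration of why pd.to_numeric can be more helpful than astype here: with the default errors='raise', the exception text itself usually names the offending string (and, in the pandas versions I am aware of, its position). The sample values below are made up for the sketch and are not from the original CSV.

```python
import pandas as pd

# Hypothetical stand-in for ColA; "oops" plays the role of the bad value.
col = pd.Series(["1.5", "-2", "oops", "4"], name="ColA")

try:
    pd.to_numeric(col)  # errors='raise' is the default
    message = None
except ValueError as exc:
    # Keep the exception text so it can be inspected or logged.
    message = str(exc)

print(message)
```

This only reports the first failure, so it is mostly useful as a quick check before a fuller scan of the column.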

pd.to_numeric has an errors keyword argument:

arg : list, tuple or array of objects, or Series
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
- If 'raise', then invalid parsing will raise an exception
- If 'coerce', then invalid parsing will be set as NaN
- If 'ignore', then invalid parsing will return the input
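To actually locate the offending rows, one possible approach is to coerce the column and then look for values that became NaN even though the original cell was not empty. This is a minimal sketch, not the only way to do it; the inline CSV text and the names csv_text, converted, bad_mask, bad_rows and bad_line_numbers are invented for the example, with only the column names taken from the question.

```python
import io
import pandas as pd

# Hypothetical stand-in for import.csv: the third line has a bad ColA value.
csv_text = (
    "2018,1,1,00:00,1.5\n"
    "2018,1,2,00:00,-2\n"
    "2018,1,3,00:00,oops\n"
    "2018,1,4,00:00,4\n"
)
df = pd.read_csv(
    io.StringIO(csv_text),
    names=["Year", "Month", "Day", "Time", "ColA"],
)

# errors='coerce' turns anything unparseable into NaN instead of raising.
converted = pd.to_numeric(df["ColA"], errors="coerce")

# A row is "bad" if it had a value originally but is NaN after conversion.
bad_mask = converted.isna() & df["ColA"].notna()
bad_rows = df[bad_mask]
print(bad_rows)

# The default index is 0-based and the file has no header row,
# so the line number in the CSV is index + 1.
bad_line_numbers = (bad_rows.index + 1).tolist()
print(bad_line_numbers)

# Once the bad values are understood, keep the numeric column.
df["ColA"] = converted
```

After this, bad_rows shows the raw strings that failed to parse, and df['ColA'] is numeric, so mean and sum work on it (NaN values are skipped by default).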

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/48873457/
