python - Pandas DataFrame describe gives nan for the mean and standard deviation of a float column

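I have a DataFrame with 8M rows. When I call df.describe(), np.mean(df[col]), or df[col].mean(), I get nan as the output.

However, np.mean(df[col].values) works and returns the mean. The column contains no nan values; I checked with df[col].isna().sum() and df[col].isnull().sum().

I'm not sure how to reproduce the error.

Update: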

>>> df.head()
    col1        col2
0   2.289062    290
1   2.289062    290
2   2.289062    290
3   2.289062    290
4   2.289062    290

>>> df['col1'].dtype
dtype('float16')

Is there a way to debug or work around this error?

>>> pd.__version__
'1.3.4'

Best answer

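I think you need to cast the column to float64 or float32, because there is no hardware support for float16 on a typical processor. float16 also has a very narrow range (its largest finite value is 65504), so a sum over 8M rows of values around 2.29 overflows if it is accumulated in float16. That likely explains why np.mean on the raw array still works: NumPy computes float16 means with float32 intermediates. Cast before aggregating: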

df[col].astype('float64').mean()
df[col].astype('float32').mean()
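
To make the range argument concrete, here is a minimal sketch under stated assumptions: the frame below is hypothetical (100,000 rows standing in for the 8M in the question, with the same repeated value), and the result of the un-cast call is deliberately not asserted because it can differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: one repeated float16 value.
n_rows = 100_000
df = pd.DataFrame({"col1": np.full(n_rows, 2.289062, dtype=np.float16)})

# Largest finite value that float16 can represent.
f16_max = float(np.finfo(np.float16).max)

# The true column total, computed in float64, is far beyond that limit.
exact_total = float(df["col1"].astype("float64").sum())

# Aggregating without a cast: the question reports nan here on pandas 1.3.4;
# other versions may print inf or a finite value.
raw_mean = df["col1"].mean()

# Upcasting first keeps the intermediate sum and count in range.
mean64 = float(df["col1"].astype("float64").mean())
mean32 = float(df["col1"].astype("float32").mean())

print("float16 max:", f16_max)
print("exact total:", exact_total)
print("mean without cast:", raw_mean)
print("mean as float64:", mean64)
print("mean as float32:", mean32)
```

The cast creates a temporary copy of the column, so float32 is the cheaper choice for memory; float64 gives more headroom and precision when the column is very long.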

Source: a similar question on Stack Overflow, https://stackoverflow.com/questions/74040860/