python - Correlation heatmap turns values into NaN in Python

Tags: python pandas dataframe seaborn nan

I want to draw a heatmap of my table df, which looks normal at first:

    Total   Paid Post Engaged   Negative    like 
1   2178    0    0    66        0           1207
2   1042    0    0    60        0           921
3   2096    0    0    112       0           1744
4   1832    0    0    109       0           1718
5   1341    0    0    38        0           889
6   1933    0    0    123       0           1501
    ...

But after I run:

df= full_Data.iloc[1:,4:10]
df= pd.DataFrame(df,columns=['A','B','C', 'D', 'E', 'F'])

corrMatrix = df.corr()
sn.heatmap(corrMatrix, annot=True)
plt.show()

it returns an empty plot:

C:\Users\User\Anaconda3\lib\site-packages\seaborn\matrix.py:204: RuntimeWarning: All-NaN slice encountered
  vmin = np.nanmin(calc_data)
C:\Users\User\Anaconda3\lib\site-packages\seaborn\matrix.py:209: RuntimeWarning: All-NaN slice encountered
  vmax = np.nanmax(calc_data)

df returns:

    A   B   C   D   E   F
1   nan nan nan nan nan nan
2   nan nan nan nan nan nan
3   nan nan nan nan nan nan
4   nan nan nan nan nan nan
5   nan nan nan nan nan nan
    ...

Why did all the values become NaN?


Update:

I tried two other ways of renaming the columns instead of the original approach:

df.columns = ['A','B','C', 'D', 'E', 'F']

df= pd.DataFrame(df.to_numpy(),columns=['A','B','C', 'D', 'E', 'F'])

Both raise an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-43-3a27f095066b> in <module>
     12 
     13 corrMatrix = df.corr()
---> 14 sn.heatmap(corrMatrix, annot=True)
     15 plt.show()
     16 

~\Anaconda3\lib\site-packages\seaborn\_decorators.py in inner_f(*args, **kwargs)
     44             )
     45         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46         return f(**kwargs)
     47     return inner_f
     48 

~\Anaconda3\lib\site-packages\seaborn\matrix.py in heatmap(data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, linewidths, linecolor, cbar, cbar_kws, cbar_ax, square, xticklabels, yticklabels, mask, ax, **kwargs)
    545     plotter = _HeatMapper(data, vmin, vmax, cmap, center, robust, annot, fmt,
    546                           annot_kws, cbar, cbar_kws, xticklabels,
--> 547                           yticklabels, mask)
    548 
    549     # Add the pcolormesh kwargs here

~\Anaconda3\lib\site-packages\seaborn\matrix.py in __init__(self, data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, cbar, cbar_kws, xticklabels, yticklabels, mask)
    164         # Determine good default values for the colormapping
    165         self._determine_cmap_params(plot_data, vmin, vmax,
--> 166                                     cmap, center, robust)
    167 
    168         # Sort out the annotations

~\Anaconda3\lib\site-packages\seaborn\matrix.py in _determine_cmap_params(self, plot_data, vmin, vmax, cmap, center, robust)
    202                 vmin = np.nanpercentile(calc_data, 2)
    203             else:
--> 204                 vmin = np.nanmin(calc_data)
    205         if vmax is None:
    206             if robust:

<__array_function__ internals> in nanmin(*args, **kwargs)

~\Anaconda3\lib\site-packages\numpy\lib\nanfunctions.py in nanmin(a, axis, out, keepdims)
    317         # Fast, but not safe for subclasses of ndarray, or object arrays,
    318         # which do not implement isnan (gh-9009), or fmin correctly (gh-8975)
--> 319         res = np.fmin.reduce(a, axis=axis, out=out, **kwargs)
    320         if np.isnan(res).any():
    321             warnings.warn("All-NaN slice encountered", RuntimeWarning,

ValueError: zero-size array to reduction operation fmin which has no identity

Best answer

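I think the problem is that you pass a DataFrame object to the pd.DataFrame constructor. The original column names differ from the new names in the list, so pandas finds no matching columns and creates only NaN values.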
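
The label-matching behaviour can be reproduced on a small frame. The sketch below is only an illustration: `src` is a hypothetical stand-in for the slice of full_Data, with three of the question's columns and rows, and `index=src.index` is an addition here to keep the row labels.

```python
import pandas as pd

# Hypothetical stand-in for the slice taken from full_Data in the question.
src = pd.DataFrame(
    {"Total": [2178, 1042, 2096], "Engaged": [66, 60, 112], "like": [1207, 921, 1744]},
    index=[1, 2, 3],
)
new_names = ["A", "D", "F"]

# Passing a DataFrame together with columns= selects columns by label,
# so labels that do not exist in src come back as all-NaN columns.
reindexed = pd.DataFrame(src, columns=new_names)
print(reindexed.isna().all().all())  # True

# Passing the raw values drops the old labels, so the new names are just attached.
from_values = pd.DataFrame(src.to_numpy(), columns=new_names, index=src.index)

# Assigning to .columns on a copy renames the columns and keeps the data.
renamed = src.copy()
renamed.columns = new_names
print(renamed)
```

Both `from_values` and `renamed` hold the original numbers under the new names, while `reindexed` contains only NaN.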

One solution is to convert it to a numpy array:

df= pd.DataFrame(df.to_numpy(),columns=['A','B','C', 'D', 'E', 'F'])

Or set the new column names in a separate step, without the DataFrame constructor:

df = full_Data.iloc[1:,4:10]
df.columns = ['A','B','C', 'D', 'E', 'F']

Another solution uses rename with a dict built from the existing columns:

old = df.columns
new = ['A','B','C', 'D', 'E', 'F']

df = df.rename(columns=dict(zip(old, new)))
print (df)
      A  B  C    D  E     F
1  2178  0  0   66  0  1207
2  1042  0  0   60  0   921
3  2096  0  0  112  0  1744
4  1832  0  0  109  0  1718
5  1341  0  0   38  0   889
6  1933  0  0  123  0  1501

print (df.corr())
          A   B   C         D   E         F
A  1.000000 NaN NaN  0.606808 NaN  0.727034
B       NaN NaN NaN       NaN NaN       NaN
C       NaN NaN NaN       NaN NaN       NaN
D  0.606808 NaN NaN  1.000000 NaN  0.916325
E       NaN NaN NaN       NaN NaN       NaN
F  0.727034 NaN NaN  0.916325 NaN  1.000000
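
The remaining NaN entries for B, C and E come from columns that never vary: a constant column has zero standard deviation, so its correlation is undefined. A minimal sketch, assuming you would rather drop such columns before plotting (the `nunique` filter is a suggestion here, not part of the original answer):

```python
import pandas as pd

# Three of the columns printed above; B is constant.
df = pd.DataFrame(
    {
        "A": [2178, 1042, 2096, 1832, 1341, 1933],
        "B": [0, 0, 0, 0, 0, 0],
        "D": [66, 60, 112, 109, 38, 123],
    }
)

# B never varies, so every correlation involving B is NaN.
print(df.corr())

# Keep only columns with more than one distinct value, then correlate.
varying = df.loc[:, df.nunique() > 1]
corr = varying.corr()
print(corr)
```

The filtered `corr` has no NaN cells, so a heatmap drawn from it has no blank rows or columns.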

Edit:

The problem is that the columns are not numeric, so convert them:

df = df.astype(int)

Or:

df = df.apply(pd.to_numeric, errors='coerce')
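
To check whether this applies to your data, inspect `df.dtypes` before and after the conversion. The sketch below assumes the values were read in as strings; `raw` is a hypothetical reconstruction using the numbers from the question. If corr() dropped the non-numeric columns in the asker's pandas version, the correlation matrix would have been empty, which would be consistent with the zero-size array error, though that is an inference rather than something the thread confirms.

```python
import pandas as pd

# Hypothetical reconstruction: the question's values stored as strings.
raw = pd.DataFrame(
    {
        "A": ["2178", "1042", "2096", "1832", "1341", "1933"],
        "D": ["66", "60", "112", "109", "38", "123"],
        "F": ["1207", "921", "1744", "1718", "889", "1501"],
    }
)
print(raw.dtypes)  # non-numeric dtype (object in most pandas versions)

# errors="coerce" turns any value that cannot be parsed into NaN instead of raising.
numeric = raw.apply(pd.to_numeric, errors="coerce")
print(numeric.dtypes)

corr = numeric.corr()
print(corr.round(6))
```

The resulting `corr` can then be passed to sn.heatmap as in the question. Compared with `astype(int)`, which raises on any value that is not a valid integer, `pd.to_numeric` with `errors='coerce'` is the more forgiving choice for messy input.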

Regarding "python - Correlation heatmap turns values into NaN in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68267296/
