python - My DataFrame has NaN values but it shouldn't

Tags: python pandas nan

I can't seem to access the first (non-index) column of my data, while all the other columns work fine:

import pandas as pd

df = pd.read_csv('stock_conf_GT_50.csv')
df.head()

The data looks fine here:

       close  eqId        date  IntDate  expiry  delta   ivMid    conf
0  37.380005     7  2008-01-02    39447       1     50  0.3850  0.8663
1  37.380005     7  2008-01-02    39447       1     90  0.5053  0.7876
2  36.960007     7  2008-01-03    39448       1     50  0.3915  0.8597
3  36.960007     7  2008-01-03    39448       1     90  0.5119  0.7438
4  35.179993     7  2008-01-04    39449       1     50  0.4055  0.8454

The column names look fine too:

df.columns
Index([' close', 'eqId', 'date', 'IntDate', 'expiry', 'delta', 'ivMid',
   'conf'],
  dtype='object')

I can see some of the data:

df['eqId'].head()
0    7
1    7
2    7
3    7
4    7
Name: eqId, dtype: int64

But not the first (non-index) column:

df['close'].head()

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-118-f7ce330a88a7> in <module>()
----> 1 df['close'].head()

C:\Users\camcompco\AppData\Roaming\Python\Python34\site-packages\pandas\core\frame.py in __getitem__(self, key)
   1789             return self._getitem_multilevel(key)
   1790         else:
-> 1791             return self._getitem_column(key)
   1792 
   1793     def _getitem_column(self, key):

C:\Users\camcompco\AppData\Roaming\Python\Python34\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   1796         # get column
   1797         if self.columns.is_unique:
-> 1798             return self._get_item_cache(key)
   1799 
   1800         # duplicate columns & possible reduce dimensionaility

C:\Users\camcompco\AppData\Roaming\Python\Python34\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   1082         res = cache.get(item)
   1083         if res is None:
-> 1084             values = self._data.get(item)
   1085             res = self._box_item_values(item, values)
   1086             cache[item] = res

C:\Users\camcompco\AppData\Roaming\Python\Python34\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   2849 
   2850             if not isnull(item):
-> 2851                 loc = self.items.get_loc(item)
   2852             else:
   2853                 indexer = np.arange(len(self.items))[isnull(self.items)]

C:\Users\camcompco\AppData\Roaming\Python\Python34\site-packages\pandas\core\index.py in get_loc(self, key, method)
   1576         """
   1577         if method is None:
-> 1578             return self._engine.get_loc(_values_from_object(key))
   1579 
   1580         indexer = self.get_indexer([key], method=method)

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:3811)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:3691)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item  (pandas\hashtable.c:12336)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12287)()

KeyError: 'close'

This is what I get when I run this code:

pd.DataFrame(df, columns=['close', 'ivMid', 'eqId'], index=None)

    close   ivMid  eqId
0     NaN  0.3850     7
1     NaN  0.5053     7
2     NaN  0.3915     7
3     NaN  0.5119     7
4     NaN  0.4055     7
5     NaN  0.5183     7
6     NaN  0.4464     7
7     NaN  0.5230     7
8     NaN  0.4453     7
9     NaN  0.4826     7
10    NaN  0.5668     7

Accepted answer

You can see in the Index that there is a space in front of close:

Index([' close', 'eqId', 'date', 'IntDate', 'expiry', 'delta', 'ivMid',

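Hence the KeyError when you try to access the 'close' column: the actual label is ' close', with a leading space.
You have to access it via df[' close'].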
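To make the cause concrete, here is a minimal sketch that reproduces the behaviour. The inline CSV text is a hypothetical stand-in for stock_conf_GT_50.csv, reduced to three columns, and the `reindex` call is used on the assumption that it mirrors the column selection the `DataFrame(df, columns=[...])` call in the question performs, which would explain the NaN column there.

```python
import io

import pandas as pd

# Hypothetical stand-in for the question's file; note the space before "close".
csv_text = " close,eqId,ivMid\n37.380005,7,0.3850\n36.960007,7,0.3915\n"
df = pd.read_csv(io.StringIO(csv_text))

# repr() of each label makes the leading space visible.
print([repr(name) for name in df.columns])

# The label without the space does not exist, so lookup fails.
try:
    df["close"]
except KeyError as err:
    print("KeyError:", err)

# The label with the space works.
close_with_space = df[" close"]

# Asking for a label that does not exist via reindexing creates
# a new column filled with NaN instead of raising.
reindexed = df.reindex(columns=["close", "ivMid", "eqId"])
print(reindexed)
```

Printing the labels with `repr` is a quick way to spot stray whitespace that the default `Index` display can hide at a glance.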

An alternative is to apply strip to the column labels so that none of them has leading or trailing whitespace:

df.columns = df.columns.map(lambda x: x.strip())
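A minimal sketch of the strip approach, applied to the column labels rather than the row index. It uses the vectorised `Index.str.strip()` accessor in place of a `map` with a lambda. The inline CSV text is again a hypothetical stand-in for the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the question's file, with a leading space in " close".
csv_text = " close,eqId,ivMid\n37.380005,7,0.3850\n"
df = pd.read_csv(io.StringIO(csv_text))

# Remove leading and trailing whitespace from every column label.
df.columns = df.columns.str.strip()

# The column is now reachable under the expected name.
print(df.columns.tolist())
print(df["close"])
```

`read_csv` also accepts a `skipinitialspace` parameter that skips spaces after a delimiter. Whether it covers the first header field in your pandas version is worth checking before relying on it.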

Regarding "python - My DataFrame has NaN values but it shouldn't", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30420263/
