python - Selecting the NaN index in a pandas DataFrame

I have a DataFrame df whose index contains the following values:

In [10]: df.index.unique()
Out[10]: array([u'DC', nan, u'BS', u'AB', u'OA'], dtype=object)

I can easily select df.ix["DC"], df.ix["BS"], and so on. But I am having trouble selecting the nan index.

None of df.ix[nan], df.ix["nan"], or df.ix[np.nan] works.

How do I select the rows whose index is nan?

Best answer

One approach is to use df.index.isnull() to identify where the NaNs are:

In [218]: df = pd.DataFrame({'Date': [0, 1, 2, 0, 1, 2], 'Name': ['A', 'B', 'C', 'A', 'B', 'C'], 'val': [0, 1, 2, 3, 4, 5]}, index=['DC', np.nan, 'BS', 'AB', 'OA', np.nan]); df
Out[218]: 
     Date Name  val
DC      0    A    0
NaN     1    B    1
BS      2    C    2
AB      0    A    3
OA      1    B    4
NaN     2    C    5

In [219]: df.index.isnull()
Out[219]: array([False,  True, False, False, False,  True], dtype=bool)

You can then select those rows with df.loc:

In [220]: df.loc[df.index.isnull()]
Out[220]: 
     Date Name  val
NaN     1    B    1
NaN     2    C    5
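
As a standalone script, the two steps above might look like the following minimal sketch. It rebuilds the same example frame; the variable names nan_mask, nan_rows, and labelled_rows are illustrative and not part of the original answer, and the inverted mask is an added illustration of selecting the complementary rows.

```python
import numpy as np
import pandas as pd

# Same example frame as in the session above.
df = pd.DataFrame(
    {
        "Date": [0, 1, 2, 0, 1, 2],
        "Name": ["A", "B", "C", "A", "B", "C"],
        "val": [0, 1, 2, 3, 4, 5],
    },
    index=["DC", np.nan, "BS", "AB", "OA", np.nan],
)

# Boolean array: True where the index label is NaN.
nan_mask = df.index.isnull()

# Rows whose index label is NaN.
nan_rows = df.loc[nan_mask]

# Inverting the mask keeps only the rows with a real label.
labelled_rows = df.loc[~nan_mask]

print(nan_rows["val"].tolist())       # [1, 5]
print(labelled_rows.index.tolist())   # ['DC', 'BS', 'AB', 'OA']
```

Because the mask is an ordinary boolean array, it can be combined with other conditions using & and | before being passed to df.loc.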

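Note: my original answer used pd.isnull(df.index) rather than Zero's suggestion, df.index.isnull(). df.index.isnull() is preferable because, for index types that cannot hold NaN, such as Int64Index and RangeIndex, the isnull method returns an array of all False values immediately instead of needlessly checking every item in the index for NaN.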
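
The following small sketch illustrates the point of the note. It assumes a reasonably recent pandas release; obj_index and range_index are hypothetical names used only for this illustration. It checks that both spellings agree on an index that can hold NaN, and that a RangeIndex reports no missing labels.

```python
import numpy as np
import pandas as pd

# An index that can hold NaN: both spellings return the same mask.
obj_index = pd.Index(["DC", np.nan, "BS"])
assert (pd.isnull(obj_index) == obj_index.isnull()).all()

# A RangeIndex cannot hold NaN, so its mask is all False.
range_index = pd.RangeIndex(5)
print(range_index.isnull().any())  # False
```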

Regarding python - selecting the NaN index in a pandas DataFrame, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25536133/
