I am trying to select a subset of a DataFrame: only some of the columns, with the rows filtered by a condition.
df.loc[df.a.isin(['Apple', 'Pear', 'Mango']), ['a', 'b', 'f', 'g']]
However, I get this error:
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.
What is the correct way to slice and filter now?
Best answer
TL;DR: There is probably a typo or misspelling in one of your column names.
This change was introduced in v0.21.0 and is explained at length in the docs:
Previously, selecting with a list of labels, where one or more labels were missing would always succeed, returning NaN for missing labels. This will now show a FutureWarning. In the future this will raise a KeyError (GH15747). This warning will trigger on a DataFrame or a Series for using .loc[] or [[]] when passing a list-of-labels with at least 1 missing label.
For example,
df
A B C
0 7.0 NaN 8
1 3.0 3.0 5
2 8.0 1.0 7
3 NaN 0.0 3
4 8.0 2.0 7
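To follow along, the frame shown above can be rebuilt roughly like this. The values are copied from the printed output; the dtypes (floats for A and B, integers for C) are an assumption inferred from how the values are displayed.

```python
import numpy as np
import pandas as pd

# Values taken from the printed frame above; dtypes are inferred, not confirmed.
df = pd.DataFrame({
    "A": [7.0, 3.0, 8.0, np.nan, 8.0],
    "B": [np.nan, 3.0, 1.0, 0.0, 2.0],
    "C": [8, 5, 7, 3, 7],
})

# Row filter plus column selection, with every requested label present.
subset = df.loc[df["A"].gt(6), ["A", "C"]]
print(subset)
```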
Try the same kind of slicing you are doing:
df.loc[df.A.gt(6), ['A', 'C']]
A C
0 7.0 8
2 8.0 7
4 8.0 7
No problem. Now, try replacing C with a column label that does not exist:
df.loc[df.A.gt(6), ['A', 'D']]
FutureWarning: Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.
A D
0 7.0 NaN
2 8.0 NaN
4 8.0 NaN
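If the NaN-filled column is actually what you want, the warning itself points to .reindex() as the alternative. A minimal sketch, assuming the same example frame as above (rebuilt here so the snippet runs on its own) and that a missing label "D" should come back as a column of NaN:

```python
import numpy as np
import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained.
df = pd.DataFrame({
    "A": [7.0, 3.0, 8.0, np.nan, 8.0],
    "B": [np.nan, 3.0, 1.0, 0.0, 2.0],
    "C": [8, 5, 7, 3, 7],
})

wanted = ["A", "D"]  # "D" is deliberately not a column of df

# Filter the rows with .loc first, then let reindex add any missing
# labels as all-NaN columns instead of warning or raising.
result = df.loc[df["A"].gt(6)].reindex(columns=wanted)
print(result)
```

This makes the intent explicit: reindex is documented to introduce NaN for labels that are not present, so the result does not depend on the deprecated .loc behaviour.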
So, in your case, the error is caused by the column labels you are passing to loc. Take another look at them.
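One way to take that look is to compare the requested labels against df.columns before slicing. This is only a sketch: the frame below is hypothetical (the question does not show its data), and it deliberately lacks a "g" column to stand in for a misspelled name.

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame; note there is no "g" column.
df = pd.DataFrame({
    "a": ["Apple", "Kiwi", "Pear"],
    "b": [1, 2, 3],
    "f": [4, 5, 6],
})

wanted = ["a", "b", "f", "g"]

# Labels that were requested but do not exist in the frame.
missing = [col for col in wanted if col not in df.columns]
print("missing labels:", missing)

# Printing the column list as a Python list shows the exact names,
# which helps spot stray whitespace or case differences.
print(df.columns.tolist())

# Slice with only the labels that are really there.
present = [col for col in wanted if col in df.columns]
subset = df.loc[df["a"].isin(["Apple", "Pear", "Mango"]), present]
print(subset)
```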
Source: "python - Pandas slicing FutureWarning with 0.21.0" on Stack Overflow: https://stackoverflow.com/questions/47896240/