Python: drop all instances of a feature from a DataFrame if a NaN threshold is met

Using df.dropna(thresh = x, inplace=True), I can successfully drop the rows that have fewer than x non-NaN values.

But my df looks like this:

          2001     2002     2003    2004

bob   A   123      31       4        12
bob   B   41        1       56       13
bob   C   nan      nan      4        nan

bill  A   451      8        nan      24
bill  B   32       5        52        6
bill  C   623      12       41       14

#Repeating features (A,B,C) for each index/name

so this drops only the single row/instance where the thresh condition is met, and keeps the other instances of that feature.
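To make the behaviour reproducible, here is a minimal sketch that rebuilds the frame above and applies the row-wise drop. The construction via MultiIndex.from_product and the variable name row_wise are my own assumptions; the question does not show how the df was built.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the example: level 0 = name, level 1 = feature.
index = pd.MultiIndex.from_product([["bob", "bill"], ["A", "B", "C"]])
df = pd.DataFrame(
    [
        [123, 31, 4, 12],
        [41, 1, 56, 13],
        [np.nan, np.nan, 4, np.nan],
        [451, 8, np.nan, 24],
        [32, 5, 52, 6],
        [623, 12, 41, 14],
    ],
    index=index,
    columns=[2001, 2002, 2003, 2004],
)

# thresh=2 keeps only rows with at least 2 non-NaN values.
row_wise = df.dropna(thresh=2)

# Only (bob, C) is removed; (bill, C) survives, which is the problem.
print(row_wise)
```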

What I want is something that drops the entire feature, if the thresh is met for any one row, such as:

df.dropna(thresh = 2, inplace=True):

           2001     2002     2003    2004

bob    A    123      31       4        12
bob    B    41        1       56       13

bill   A    451      8        nan      24
bill   B    32       5        52        6

#Drops C from the whole df

where C is dropped from the entire df, not just the one occurrence under bob that meets the condition.

Best answer

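Your sample looks like a multi-index DataFrame where index level 1 holds the features A, B, C and index level 0 holds the names. You can use notna and sum to build a mask that identifies the rows with fewer than 2 non-NaN values, and then take their index level 1 values. Finally, use df.query to slice out the rows (ilevel_1 is how query refers to an unnamed index level 1).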

a = df.notna().sum(1).lt(2).loc[lambda x: x].index.get_level_values(1)
df_final = df.query('ilevel_1 not in @a')

Out[275]:
         2001  2002  2003  2004
bob  A  123.0  31.0   4.0  12.0
     B   41.0   1.0  56.0  13.0
bill A  451.0   8.0   NaN  24.0
     B   32.0   5.0  52.0   6.0

Method 2:
Use notna, sum, groupby and transform to create a mask that is True for the groups in which every row has at least 2 non-NaN values. Finally, use this mask to slice the rows.

m = df.notna().sum(1).groupby(level=1).transform(lambda x: x.ge(2).all())
df_final = df[m]

Out[296]:
         2001  2002  2003  2004
bob  A  123.0  31.0   4.0  12.0
     B   41.0   1.0  56.0  13.0
bill A  451.0   8.0   NaN  24.0
     B   32.0   5.0  52.0   6.0

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59593901/
