python - Filtering a pandas column by string

I am trying to filter a pandas column by a string, but the problem I am facing is that each row holds a list rather than a plain string.

A small sample of the column:

tags
['get_mail_mail']
['app', 'oflline_hub', 'smart_home']
['get_mail_mail', 'smart_home']
['web']
[]
[]
['get_mail_mail']

I am using this:

df[df["tags"].str.contains("smart_home", case=False, na=False)]

But it returns an empty DataFrame.

Best answer

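You can explode the column, test each element with str.contains, and then aggregate back to one value per original row with groupby.any: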

m = (df['tags'].explode()
     .str.contains('smart_home', case=False, na=False)
     .groupby(level=0).any()
    )

out = df[m]
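
As a minimal sketch of what happens here, the sample column can be rebuilt by hand (the DataFrame construction below is an assumption based on the sample shown in the question). It illustrates that str.contains applied directly to list elements matches nothing, while the exploded version produces one boolean per original row:

```python
import pandas as pd

# Rebuild the sample column from the question (assumed construction).
df = pd.DataFrame({
    "tags": [
        ["get_mail_mail"],
        ["app", "oflline_hub", "smart_home"],
        ["get_mail_mail", "smart_home"],
        ["web"],
        [],
        [],
        ["get_mail_mail"],
    ]
})

# Applied directly, each element is a list rather than a string, so every
# result is missing and na=False turns all of them into False.
direct = df["tags"].str.contains("smart_home", case=False, na=False)

# explode() gives one row per tag and repeats the original index label.
exploded = df["tags"].explode()

# Test each tag, then collapse back to one boolean per original row.
m = (exploded
     .str.contains("smart_home", case=False, na=False)
     .groupby(level=0).any())

out = df[m]
print(out)
```

Empty lists become a single missing value after explode, which na=False maps to False, so those rows are simply excluded.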

Or join each list into a single string with a delimiter and use str.contains:

out = df[df['tags'].str.join('|').str.contains('smart_home')]

Or use a list comprehension:

out = df[[any(s=='smart_home' for s in l) for l in df['tags']]]

Output:

                             tags
1  [app, oflline_hub, smart_home]
2     [get_mail_mail, smart_home]
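
The three approaches agree on the sample data, but they are not strictly equivalent: the first is a case-insensitive substring match, the second a case-sensitive substring match, and the third an exact match. The sketch below shows the difference on a small hypothetical frame (the name demo and its tags are made up for illustration and are not from the question):

```python
import pandas as pd

# Hypothetical data chosen to expose the differences between the approaches.
demo = pd.DataFrame({
    "tags": [["smart_home"], ["Smart_Home"], ["smart_home_v2"], ["web"], []]
})

# 1) explode + str.contains(case=False): case-insensitive substring match.
substring_ci = (demo["tags"].explode()
                .str.contains("smart_home", case=False, na=False)
                .groupby(level=0).any())

# 2) join + str.contains: case-sensitive substring match by default.
joined = demo["tags"].str.join("|").str.contains("smart_home")

# 3) list comprehension with ==: exact, case-sensitive match.
exact = pd.Series(
    [any(s == "smart_home" for s in tags) for tags in demo["tags"]],
    index=demo.index,
)

print(pd.DataFrame({"substring_ci": substring_ci, "joined": joined, "exact": exact}))
```

If only whole tags should match, the list comprehension is the safest choice; if partial or case-insensitive matches are wanted, the explode variant is the more flexible one.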

Regarding python - Filtering a pandas column by string, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/74039033/
