python - Filtering a dataset by 'contains value' in Python


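In Python, how can I filter a column so that it keeps every row whose value contains a given string?

An example is a dataset with a column named "City" whose values can be "Sydney", "Greater Sydney", "North Sydney", and so on. If the user enters "Sydney", how can I make sure that all of these variations are included in the filter?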

#user inputs column
input1 = input()
country_city = input1.title()

#user inputs value
input2 = input()
country_city_value = input2.title()

#filtering step (current)
filtered = dataset[dataset[country_city] == country_city_value]
print(filtered)

Best answer

str.contains is a good approach, but if your input is "North Sydney" you will not get the "Sydney" rows back, only "North Sydney". Example:

import pandas as pd

df = pd.DataFrame({
    'A': ['Sydney', 'North Sydney', 'Alaska']
})
print(df)
              A
0        Sydney
1  North Sydney
2        Alaska
input='North Sydney'
filtered = df[df.A.str.contains(input)]

print(filtered)
              A
1  North Sydney
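
For the case in the question, where the input is just "Sydney", a plain substring match already returns every variant. The following is a minimal sketch, assuming a hypothetical `City` column and made-up sample data. `case`, `regex`, and `na` are standard `str.contains` parameters: here they make the match case-insensitive, treat the input as literal text, and count missing values as non-matches.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's dataset
dataset = pd.DataFrame({
    'City': ['Sydney', 'Greater Sydney', 'North Sydney', 'Alaska', None]
})

column = 'City'   # would come from the first input() in the question
value = 'sydney'  # would come from the second input() in the question

# Substring match: case-insensitive, literal text, missing values never match
mask = dataset[column].str.contains(value, case=False, regex=False, na=False)
filtered = dataset[mask]
print(filtered)
```

With `case=False` there is no need to call `.title()` on the search value first, although the column name still has to match exactly.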

So, to improve on this, use split() together with str.contains():

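import re

# Split the input into words and join them into a regex alternation,
# so a row matches if it contains any one of the words
terms = input.split()
print(terms)
['North', 'Sydney']

filtered = df[df.A.str.contains('|'.join(re.escape(t) for t in terms))]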

print(filtered)
              A
0        Sydney
1  North Sydney

This way every part of your input is taken into account: a row is kept if it contains any of the words.
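
The split-and-match idea can be wrapped in a small helper. This is only a sketch: the function name `filter_contains_any` is hypothetical, and the case-insensitive matching and the handling of empty input are assumptions that go beyond the original answer.

```python
import re
import pandas as pd

def filter_contains_any(df, column, text):
    """Return rows whose `column` contains any whitespace-separated word of `text`."""
    words = text.split()
    if not words:
        # Empty input: return an empty frame with the same columns
        return df.iloc[0:0]
    # Escape each word so characters such as '.' or '(' are matched literally
    pattern = '|'.join(re.escape(w) for w in words)
    mask = df[column].str.contains(pattern, case=False, regex=True, na=False)
    return df[mask]

df = pd.DataFrame({'A': ['Sydney', 'North Sydney', 'Alaska']})
result = filter_contains_any(df, 'A', 'north sydney')
print(result)
```

Matching on any word widens the result: an input of "North Sydney" would also keep a value such as "North Melbourne". If that is too broad, match on the whole phrase instead.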

For more on python - Filtering a dataset by 'contains value' in Python, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/63628564/
