Python file search using regular expressions

Tags: python regex

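I have a file with many lines. Each line starts with {"id": followed by an id number in quotes (e.g. {"id": "106"). I am trying to use a regular expression to search the whole document line by line and print the lines that match any of 5 different id values. To do this, I created a list of the ids and want to iterate over it, matching only lines that start with {"id": "(an id number from the list)". I'm really confused about how to do this. Here is what I have so far: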

f= "bdata.txt"    
statids = ["85", "106", "140", "172" , "337"] 
x= re.findall('{"id":', statids, 'f')
for line in open(file):
            print(x)

The error I keep getting is: TypeError: unsupported operand type(s) for &: 'str' and 'int'

I need to match the whole line so that I can split it and put it into a class.

Any suggestions? Thank you for your time.

Best answer

You can use a regex to retrieve the id from the line: ^\{"id": "(\d+)", where group 1 gives you the id. You can then check whether that id is present in statids.

Demo:

import re

statids = ["85", "106", "140", "172", "337"]

with open("bdata.txt") as file:
    for line in file:
        # Raw string avoids invalid-escape warnings for \{ and \d.
        search = re.search(r'^\{"id": "(\d+)"', line)
        if search:
            id = search.group(1)
            if id in statids:
                print(line.rstrip())

For the following sample content in the file:

{"id": "100" hello
{"id": "106" world
{"id": "2" hi
{"id": "85" bye
{"id": "10" ok
{"id": "140" good
{"id": "165" fine
{"id": "172" great
{"id": "337" morning
{"id": "16" evening

The output will be:

{"id": "106" world
{"id": "85" bye
{"id": "140" good
{"id": "172" great
{"id": "337" morning

Regarding Python file search using regular expressions, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/69637474/
