python - Tracking progress through a namedtuple CSV loop in Python

Using collections.namedtuple, the following Python code processes records in a database by reading a csv file of identifiers (integers in a column named ContentItemId). An example record is https://api.aucklandmuseum.com/id/library/ephemera/21291 .

Its purpose is to check the HTTP status for each given id and write it to disk:

import requests
from collections import namedtuple
import csv

with open('in.csv', mode='r') as f:
    reader = csv.reader(f)

    all_records = namedtuple('rec', next(reader))
    records = [all_records._make(row) for row in reader]

    #Create output file
    with open('out.csv', mode='w+') as o:
        w = csv.writer(o)
        w.writerow(["ContentItemId","code"])

        count = 1
        for r in records:
            id   = r.ContentItemId
            url  = "https://api.aucklandmuseum.com/id/library/ephemera/" + id
            req  = requests.get(url, allow_redirects=False)
            code = req.status_code
            w.writerow([id, code])

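How can I print the code's progress (ideally at the 25%, 50%, and 75% marks) to the console from within the latter loop? Also, if I add an unindented print("Complete") at the bottom, will that line be reached?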

Thanks in advance.

Edit: Thanks for all the help. My (working!) code now looks like this:

import csv
import requests
import pandas
import time
from collections import namedtuple
from tqdm import tqdm

with open('active_true_pub_no.csv', mode='r') as f:
    reader = csv.reader(f)

    all_records = namedtuple('rec', next(reader))
    records = [all_records._make(row) for row in reader]

    with open('out.csv', mode='w+') as o:
        w = csv.writer(o)
        w.writerow(["ContentItemId","code"])

        num = len(records)
        print("Checking {} records...\n".format(num))

        with tqdm(total=num, bar_format="{percentage:3.0f}% {bar} [{n_fmt}/{total_fmt}]  ", ncols=64) as pbar:
            for r in records:
                pbar.update(1)
                id   = r.ContentItemId
                url  = "https://api.aucklandmuseum.com/id/library/ephemera/" + id
                req  = requests.get(url, allow_redirects=False)
                code = req.status_code
                w.writerow([id, code])
                # time.sleep(.25)

print ('\nSummary: ')
df = pandas.read_csv("out.csv")
print(df['code'].value_counts())

I used pandas' value_counts to summarize the results at the end.
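
For readers without pandas installed, the same kind of summary can be sketched with the standard library. This is only an illustration: the sample rows and status codes below are hypothetical stand-ins for out.csv, and count_codes is a made-up helper name. Like value_counts, Counter.most_common lists the most frequent value first.

```python
from collections import Counter
import csv
import io

# Hypothetical sample standing in for out.csv; these status codes are made up.
sample = "ContentItemId,code\n21200,200\n21201,404\n21202,200\n"


def count_codes(f):
    """Count how many rows carry each value in the 'code' column."""
    return Counter(row["code"] for row in csv.DictReader(f))


counts = count_codes(io.StringIO(sample))

# Print each status code with its count, most frequent first.
for code, n in counts.most_common():
    print(code, n)
```

To run it against the real file, pass open('out.csv', newline='') instead of the io.StringIO object. Note that csv.DictReader returns the codes as strings, whereas pandas.read_csv would parse them as integers.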

Best answer

To get a progress bar, use tqdm.

Data (from `in.csv`):

ContentItemId
21200
21201
21202
21203
21204
21205
21206
...
21296
21297
21298
21299
21300

Code:

from collections import namedtuple
import csv
import requests
from tqdm import tqdm


with open('in.csv', mode='r') as f:
    reader = csv.reader(f)

    all_records = namedtuple('rec', next(reader))
    records = [all_records._make(row) for row in reader]

    #Create output file
    with open('out.csv', mode='w+') as o:
        w = csv.writer(o)
        w.writerow(["ContentItemId","code"])

        # Pass the record count as total so the bar can show a percentage
        with tqdm(total=len(records)) as pbar:
            for r in records:
                pbar.update(1)
                id   = r.ContentItemId
                url  = "https://api.aucklandmuseum.com/id/library/ephemera/" + id
                req  = requests.get(url, allow_redirects=False)
                code = req.status_code
                w.writerow([id, code])
    print('Complete!')

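Note the with tqdm(total=len(records)) as pbar: line added before the for loop.
When the script is run from the console, a progress bar appears showing the percentage completed.
The counter in the bar, for example 21/101, is the number of records processed so far out of the length of the records list.
- tqdm provides a percentage progress bar and a completed/total count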
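
If only the 25%, 50%, and 75% marks mentioned in the question are wanted, rather than a live bar, a plain-Python sketch along the following lines may be enough. The names milestone_counts and track are hypothetical helpers, not part of tqdm or any library, and the four ids are just a short stand-in for the records list.

```python
def milestone_counts(total, marks=(25, 50, 75)):
    """Map the item count at which each percentage mark is first reached to that mark.

    Uses ceiling division, so a mark is reported once at least that share is done.
    With very few items two marks can land on the same count; the higher mark wins.
    """
    return {-(-total * m // 100): m for m in marks}


def track(items, marks=(25, 50, 75)):
    """Yield each item and print a line once a milestone's worth of items is done."""
    items = list(items)
    at = milestone_counts(len(items), marks)
    for i, item in enumerate(items, start=1):
        yield item
        # Runs after the caller's loop body has finished with this item.
        if i in at:
            print(f"{at[i]}% ({i}/{len(items)})")


# Stand-in loop: the requests.get(...) and w.writerow(...) calls would go in the body.
for record_id in track(["21200", "21201", "21202", "21203"]):
    pass
```

In the code above this would replace the with tqdm(...) block with for r in track(records):. Because the milestone line is printed after the loop body has handled an item, each message reflects work that has actually completed.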

This question and answer on tracking progress through a namedtuple CSV loop in Python come from a similar question on Stack Overflow: https://stackoverflow.com/questions/57948562/
