python - How to find the greatest number in data read from a CSV in Python 2.7?

There is a CSV file named vic_visitors.csv that contains the following data:

Victoria's Regions,2004,2005,2006,2007
Gippsland,63354,47083,51517,54872
Goldfields,42625,36358,30358,36486
Grampians,64092,41773,29102,38058
Great Ocean Road,185456,153925,150268,167458
Melbourne,1236417,1263118,1357800,1377291

And there is a question to answer:

Q. Write a program to find the greatest visitor number in Victoria from the CSV data vic_visitors.csv. Your program should print the result in the format "The greatest visitornumber was 'x' in 'y' in the year 'z'."

Up to this point I can access the data: data_2d gives me two-dimensional access, where data_2d[i] is a row and data_2d[i][j] is a column value within that row:

import csv
visitors=open("vic_visitors.csv")
data=csv.reader(visitors)
data_2d=list(data)

But I have no idea how to retrieve the greatest visitor number together with its corresponding region and year.

Best answer

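You have 4 problems to solve:

You need to keep the column headers so that you can report the year correctly.
The csv module gives you everything as strings, while you need to compare the values numerically.
You need to find the maximum value in each row.
You need to find the maximum row, based on the maximum value of each row.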

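You can use DictReader() to solve the first part. You can convert the values to integers either while reading the file or when determining the maximum. And you can determine the maximum per row either while reading, or all at once in a final step.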

I'd do as much as possible while reading, discarding any data that isn't needed along the way:

import csv

maximum_value = None
with open("vic_visitors.csv", 'rb') as visitors:
    reader = csv.DictReader(visitors)
    for row in reader:
        count, year = max((int(row[year]), year) for year in reader.fieldnames[1:])  # skip the first column
        if not maximum_value or count > maximum_value[0]:
            maximum_value = (count, row[reader.fieldnames[0]], year)

print "The greatest visitornumber was {} in {} in the year {}.".format(
    *maximum_value)

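The max(...) line loops over the key-value pairs of each row dictionary (DictReader uses the first line of the CSV as the keys), selecting only the year columns (that is, all fields but the first). Because the numeric value comes first in each tuple, max() returns the greatest column value for the row, paired with its year.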
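The tuple ordering can be checked in isolation. The following is only a small sketch, using the Gippsland row from the question written out by hand as (count, year) pairs; the name pairs is introduced here purely for illustration:

```python
# (count, year) pairs for the Gippsland row, typed in from the sample data.
pairs = [(63354, "2004"), (47083, "2005"), (51517, "2006"), (54872, "2007")]

# Tuples compare element by element, so max() ranks by count first and
# only falls back to the year string if two counts are equal.
count, year = max(pairs)
print("{} in {}".format(count, year))
```

Had the year been placed first, max() would simply return the pair with the latest year, regardless of the count.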

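We then store the information for the maximum row found so far (just the count, the region and the year); there is no need to keep any of the other rows. At the end, that tuple is formatted by slotting those 3 values into the template.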
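For comparison, the other option mentioned earlier (determining the maximum all at once in a final step) could look roughly like the sketch below. It is not the answer's method: it assumes a plain csv.reader instead of DictReader, and it uses a hand-typed list of lines (sample) in place of the open file so that it runs without vic_visitors.csv on disk.

```python
import csv

# Lines copied from the question's data; a list of strings stands in for the file.
sample = [
    "Victoria's Regions,2004,2005,2006,2007",
    "Gippsland,63354,47083,51517,54872",
    "Goldfields,42625,36358,30358,36486",
    "Grampians,64092,41773,29102,38058",
    "Great Ocean Road,185456,153925,150268,167458",
    "Melbourne,1236417,1263118,1357800,1377291",
]

reader = csv.reader(sample)
header = next(reader)   # keep the column headers so the year can be reported
years = header[1:]      # every column except the region name

# Build one (count, region, year) tuple per cell and take the overall maximum.
best = max(
    (int(value), row[0], year)
    for row in reader
    for year, value in zip(years, row[1:])
)

print("The greatest visitornumber was {} in {} in the year {}.".format(*best))
```

The generator expression still reads the rows lazily, so only the current best tuple is held in memory, much like the running-maximum loop.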

By using the DictReader.fieldnames list we stay flexible; as long as the first column is the region and the remaining columns are years, the code adapts to any changes.
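To illustrate that flexibility, here is a minimal sketch that wraps the same loop in a function and feeds it a differently shaped table. The function name greatest_visitors and the two-year data set are made up for this illustration and do not come from the question.

```python
import csv

def greatest_visitors(lines):
    """Return (count, region, year) for the largest cell in the table.

    Hypothetical helper: assumes the first column holds the region name
    and every remaining column is a year holding integer counts.
    """
    reader = csv.DictReader(lines)
    region_column = reader.fieldnames[0]
    year_columns = reader.fieldnames[1:]
    best = None
    for row in reader:
        # Greatest (count, year) pair within this row.
        count, year = max((int(row[y]), y) for y in year_columns)
        if best is None or count > best[0]:
            best = (count, row[region_column], year)
    return best

# Invented data with a different header name and only two year columns.
made_up = [
    "Region,2010,2011",
    "North,10,25",
    "South,40,5",
]
result = greatest_visitors(made_up)
print(result)
```

Nothing in the function mentions a specific header or year, which is the point of going through fieldnames.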

Demo:

>>> import csv
>>> sample = '''\
... Victoria's Regions,2004,2005,2006,2007
... Gippsland,63354,47083,51517,54872
... Goldfields,42625,36358,30358,36486
... Grampians,64092,41773,29102,38058
... Great Ocean Road,185456,153925,150268,167458
... Melbourne,1236417,1263118,1357800,1377291
... '''.splitlines(True)
>>> maximum_value = None
>>> reader = csv.DictReader(sample)
>>> for row in reader:
...     count, year = max((int(row[year]), year) for year in reader.fieldnames[1:])  # skip the first column
...     if not maximum_value or count > maximum_value[0]:
...         maximum_value = (count, row[reader.fieldnames[0]], year)
... 
>>> print "The greatest visitornumber was {} in {} in the year {}.".format(
...     *maximum_value)
The greatest visitornumber was 1377291 in Melbourne in the year 2007.

Regarding "python - How to find the greatest number in data read from a CSV in Python 2.7?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/30751197/
