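python - Function to read from a file and return it as a dictionary?

Tags: python python-3.x dictionary

I'm learning Python and can't figure out how to write this function, which should read a file and return its contents as a dictionary. I know I need to open the file and then use .read(), but so far I'm not sure how to sort the data. Since there will be multiple "titles", I'm trying to sort uppercase letters before all lowercase ones. Any suggestions on how to proceed?

My code so far: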

def read_text(textname):
    d = {}
    with open(textname) as f:
        for line in f:
            (title, year, height, width, media, country) = line.split() # I need to skip the first line in the file as well which just shows the categories.

Sample text file:

text0='''"Artist","Title","Year","Total Height","Total Width","Media","Country"
"Leonardo da Vinci","Mona Lisa","1503","76.8","53.0","oil paint","France"
"Leonardo da Vinci","The Last Supper","1495","460.0","880.0","tempera","Italy"'''

I want the file to be returned as:

{'Leonardo da Vinci': [("Mona Lisa",1503,76.8,53.0,"oil paint","France"),
('The Last Supper', 1495, 460.0, 880.0, 'tempera', 'Italy')]}

Best answer

One way is to use the csv module together with the dict setdefault method:

>>> import csv
>>> with open('data.csv', newline='') as f:
...   d = {}
...   reader = csv.reader(f)
...   header = next(reader) # skip the header row via the reader; keep it if you want to
...   for line in reader:
...     artist, *rest = line # first field is the key, the remaining fields are the value
...     d.setdefault(artist,[]).append(tuple(rest))
... 
>>> d
{'Leonardo da Vinci': [('Mona Lisa', '1503', '76.8', '53.0', 'oil paint', 'France'), ('The Last Supper', '1495', '460.0', '880.0', 'tempera', 'Italy')]}
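
The csv module matters here because the fields are quoted and some contain spaces, while the `line.split()` call in the question splits on whitespace. A small sketch comparing the two on one of the sample rows (the `row`, `naive` and `parsed` names are only for illustration):

```python
import csv
import io

row = '"Leonardo da Vinci","Mona Lisa","1503","76.8","53.0","oil paint","France"\n'

# str.split() with no arguments splits on whitespace,
# so quoted fields that contain spaces are torn apart.
naive = row.split()

# csv.reader honours the quoting and splits on commas instead.
parsed = next(csv.reader(io.StringIO(row)))

print(len(naive))   # 5 fragments, none of them a clean field
print(parsed)       # 7 fields with the quotes removed
```

This is also why the tuple unpacking in the question cannot work as written: the number of pieces depends on how many spaces the row happens to contain.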

A more Pythonic approach is to use defaultdict:

>>> from collections import defaultdict
>>> with open('data.csv', newline='') as f:
...   d = defaultdict(list)
...   reader = csv.reader(f)
...   header = next(reader) # skip header
...   for line in reader:
...     artist, *rest = line
...     d[artist].append(tuple(rest)) # store a tuple so the result matches the output below
... 
>>> d
defaultdict(<class 'list'>, {'Leonardo da Vinci': [('Mona Lisa', '1503', '76.8', '53.0', 'oil paint', 'France'), ('The Last Supper', '1495', '460.0', '880.0', 'tempera', 'Italy')]})
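
One thing to keep in mind if the defaultdict is returned from a function: looking up a missing key silently creates it. A minimal sketch of that behaviour and of converting to a plain dict before returning (the 'Unknown artist' key and the `plain` name are made up for the example):

```python
from collections import defaultdict

d = defaultdict(list)
d['Leonardo da Vinci'].append(('Mona Lisa', '1503'))

# Merely reading a missing key inserts it with an empty list.
_ = d['Unknown artist']
print('Unknown artist' in d)  # True

# dict(d) makes a shallow copy that behaves like a regular dict again.
plain = dict(d)
try:
    plain['Someone else']
except KeyError:
    print('plain dict raises KeyError for missing keys')
```

Whether to return `d` or `dict(d)` is a design choice; the conversion simply avoids surprising callers who expect a KeyError.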

Figuring out the best way to get the desired data types is left as an exercise... which, apparently, this whole thing was to begin with.

Regarding "python - Function to read from a file and return it as a dictionary?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40577775/
