Given the following unordered tab-separated file:
Asia Srilanka
Srilanka Colombo
Continents Europe
India Mumbai
India Pune
Continents Asia
Earth Continents
Asia India
The goal is to generate the following output (tab-separated):
Earth Continents Asia India Mumbai
Earth Continents Asia India Pune
Earth Continents Asia Srilanka Colombo
Earth Continents Europe
I wrote the following script to achieve this:
root={} # this hash will finally contain the ROOT member from which all the nodes emanate
link={} # this is to hold the grouping of immediate children
for line in f:
    line=line.rstrip('\r\n')
    line=line.strip()
    cols=list(line.split('\t'))
    parent=cols[0]
    child=cols[1]
    if not parent in link:
        root[parent]=1
    if child in root:
        del root[child]
    if not child in link:
        link[child]={}
    if not parent in link:
        link[parent]={}
    link[parent][child]=1
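Now I want to print the desired output using the two dictionaries created above (root and link). I am not sure how to do this in Python, but I know that in Perl we could write the following to get the result: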
print_links($_) for sort keys %root;

sub print_links
{
    my @path = @_;
    my %children = %{$link{$path[-1]}};
    if (%children)
    {
        print_links(@path, $_) for sort keys %children;
    }
    else
    {
        say join "\t", @path;
    }
}
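Could you help me produce the desired output in Python 3.x?
Best answer
I see the following sub-problems here:
- reading the relations from a file;
- building a hierarchy from the relations;
- writing the hierarchy to a file.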
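Assuming the height of the hierarchy tree is less than the default recursion limit (1000 in most cases), let's define a utility function for each of these separate tasks.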
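The recursion-limit assumption can be checked at runtime, and a traversal with an explicit stack sidesteps it altogether. The sketch below is not part of the original answer: `iter_paths` is a hypothetical name, and it assumes `relations` maps each parent to a list of its children, which is the shape produced by the parsing utility in this answer.

```python
import sys

# The limit is typically 1000 by default, but it is configurable per interpreter.
print(sys.getrecursionlimit())


def iter_paths(relations, root):
    """Yield every root-to-leaf path as a tuple, without recursion."""
    stack = [(root,)]
    while stack:
        path = stack.pop()
        children = relations.get(path[-1], [])
        if not children:
            # The last node has no children, so the path is complete.
            yield path
            continue
        # Push in reverse so children are visited in their stored order.
        for child in reversed(children):
            stack.append(path + (child,))


relations = {
    'Earth': ['Continents'],
    'Continents': ['Europe', 'Asia'],
    'Asia': ['Srilanka', 'India'],
    'Srilanka': ['Colombo'],
    'India': ['Mumbai', 'Pune'],
}
paths = list(iter_paths(relations, 'Earth'))
```

With this variant the tree depth is bounded by available memory rather than by the interpreter's recursion limit.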
Utilities
Relations can be parsed with
def parse_relations(lines):
    relations = {}
    splitted_lines = (line.split() for line in lines)
    for parent, child in splitted_lines:
        relations.setdefault(parent, []).append(child)
    return relations
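Because `line.split()` splits on any whitespace, a name containing a space (say, `Sri Lanka`) or a blank line would break the two-way unpacking. A possible variant, assuming the file is strictly tab-separated with exactly two columns per line, splits on tabs only and skips blank lines. `parse_relations_tsv` is a hypothetical name, not taken from the original answer.

```python
def parse_relations_tsv(lines):
    """Map each parent to the list of its children, in file order."""
    relations = {}
    for line in lines:
        line = line.rstrip('\r\n')
        if not line.strip():
            continue  # ignore blank lines
        # Assumes exactly two tab-separated columns; anything else raises ValueError.
        parent, child = line.split('\t')
        relations.setdefault(parent.strip(), []).append(child.strip())
    return relations


sample = ['Asia\tSri Lanka\n', '\n', 'Sri Lanka\tColombo\n', 'Earth\tAsia\n']
relations = parse_relations_tsv(sample)
```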
The hierarchy can be built with
Python >=3.5
def flatten_hierarchy(relations, parent='Earth'):
    try:
        children = relations[parent]
        for child in children:
            sub_hierarchy = flatten_hierarchy(relations, child)
            for element in sub_hierarchy:
                try:
                    yield (parent, *element)
                except TypeError:
                    # we've tried to unpack `None` value,
                    # it means that no successors left
                    yield (parent, child)
    except KeyError:
        # we've reached end of hierarchy
        yield None
Python <3.5: the unpacking syntax used above, (parent, *element), was added by PEP-448 in Python 3.5, but it can be replaced with itertools.chain, like
import itertools

def flatten_hierarchy(relations, parent='Earth'):
    try:
        children = relations[parent]
        for child in children:
            sub_hierarchy = flatten_hierarchy(relations, child)
            for element in sub_hierarchy:
                try:
                    yield tuple(itertools.chain([parent], element))
                except TypeError:
                    # we've tried to unpack `None` value,
                    # it means that no successors left
                    yield (parent, child)
    except KeyError:
        # we've reached end of hierarchy
        yield None
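Both versions default to `parent='Earth'`, which hard-codes the root. If the root is not known in advance, it can be derived from the relations as any parent that never appears as a child, which is what the `root` dictionary in the question keeps track of. A minimal sketch, where `find_roots` is a hypothetical helper and `relations` is assumed to map each parent to a list of children:

```python
def find_roots(relations):
    """Return, in sorted order, the parents that are nobody's child."""
    all_children = {child
                    for children in relations.values()
                    for child in children}
    return sorted(parent for parent in relations
                  if parent not in all_children)


relations = {
    'Asia': ['Srilanka', 'India'],
    'Srilanka': ['Colombo'],
    'Continents': ['Europe', 'Asia'],
    'India': ['Mumbai', 'Pune'],
    'Earth': ['Continents'],
}
roots = find_roots(relations)
```

Each root found this way can then be passed as the `parent` argument of `flatten_hierarchy`.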
The hierarchy can be exported to a file with
def write_hierarchy(hierarchy, path, delimiter='\t'):
    with open(path, mode='w') as file:
        for row in hierarchy:
            file.write(delimiter.join(row) + '\n')
Usage
Assuming the file path is 'relations.txt':
with open('relations.txt') as file:
    relations = parse_relations(file)
gives us
>>> relations
{'Asia': ['Srilanka', 'India'],
'Srilanka': ['Colombo'],
'Continents': ['Europe', 'Asia'],
'India': ['Mumbai', 'Pune'],
'Earth': ['Continents']}
Our hierarchy is
>>> list(flatten_hierarchy(relations))
[('Earth', 'Continents', 'Europe'),
('Earth', 'Continents', 'Asia', 'Srilanka', 'Colombo'),
('Earth', 'Continents', 'Asia', 'India', 'Mumbai'),
('Earth', 'Continents', 'Asia', 'India', 'Pune')]
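Finally, export it to a file named 'hierarchy.txt':
>>> hierarchy = list(flatten_hierarchy(relations))
>>> write_hierarchy(sorted(hierarchy), 'hierarchy.txt')
(we use sorted so that the rows appear in the same order as in your desired output file)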
P. S.
If you are not familiar with Python generators, we can define the flatten_hierarchy function like this:
Python >= 3.5
def flatten_hierarchy(relations, parent='Earth'):
    try:
        children = relations[parent]
    except KeyError:
        # we've reached end of hierarchy
        return None
    result = []
    for child in children:
        sub_hierarchy = flatten_hierarchy(relations, child)
        try:
            for element in sub_hierarchy:
                result.append((parent, *element))
        except TypeError:
            # we've tried to iterate through `None` value,
            # it means that no successors left
            result.append((parent, child))
    return result
Python < 3.5
import itertools

def flatten_hierarchy(relations, parent='Earth'):
    try:
        children = relations[parent]
    except KeyError:
        # we've reached end of hierarchy
        return None
    result = []
    for child in children:
        sub_hierarchy = flatten_hierarchy(relations, child)
        try:
            for element in sub_hierarchy:
                result.append(tuple(itertools.chain([parent], element)))
        except TypeError:
            # we've tried to iterate through `None` value,
            # it means that no successors left
            result.append((parent, child))
    return result
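For comparison, the Perl `print_links` routine from the question can also be ported almost line for line, using the `root` and `link` dictionaries that the question's script builds. This is a sketch rather than part of the original answer; the two dictionaries are filled in by hand here (mirroring what the question's loop would produce for the sample file) so that the block runs on its own.

```python
def print_links(link, path):
    """Print each root-to-leaf path below path[-1], tab-separated."""
    children = link.get(path[-1], {})
    if children:
        # Recurse into the children in sorted order, as the Perl version does.
        for child in sorted(children):
            print_links(link, path + [child])
    else:
        print('\t'.join(path))


# Hand-built stand-ins for the dictionaries produced by the question's loop.
root = {'Earth': 1}
link = {
    'Earth': {'Continents': 1},
    'Continents': {'Europe': 1, 'Asia': 1},
    'Asia': {'Srilanka': 1, 'India': 1},
    'Srilanka': {'Colombo': 1},
    'India': {'Mumbai': 1, 'Pune': 1},
    'Europe': {}, 'Colombo': {}, 'Mumbai': {}, 'Pune': {},
}

for top in sorted(root):
    print_links(link, [top])
```

Like the recursive utilities in the answer, this port is subject to the interpreter's recursion limit.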
Regarding "Python - creating a hierarchy file (finding root-to-leaf paths in a tree represented as a table)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44236188/