python - How do I group arrays with the same name using Python?

I have more than a thousand array categories in a text file, for example:

Category A1 and Category A2 (arrays in MATLAB code):

A1={[2,1,2]};
A1={[4,2,1,2,3]};
A2={[3,3,2,1]};
A2={[4,4,2,2]};
A2={[2,2,1,1,1]};

I would like to use Python to read the file and group them into:

A1=[{[2,1,2]} {[4,2,1,2,3]}];  
A2=[{[3,3,2,1]} {[4,4,2,2]} {[2,2,1,1,1]}];

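Best answer

Use a dict to do the grouping. I presume you mean grouping them as strings, since the values come from a MATLAB .mat file and are not valid Python containers: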

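from collections import OrderedDict
from pprint import pprint as pp

od = OrderedDict()
with open("infile") as f:
    for line in f:
        # Split "A1={[2,1,2]};" into the name and the array text.
        name, data = line.split("=")
        # Drop the trailing semicolon and newline, then group by name.
        od.setdefault(name, []).append(data.rstrip(";\n"))

# list() is needed on Python 3, where values() returns a view object.
pp(list(od.values()))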
[['{[2,1,2]}', '{[4,2,1,2,3]}'],
['{[3,3,2,1]}', '{[4,4,2,2]}', '{[2,2,1,1,1]}']]
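
If real Python lists are wanted rather than strings, the outer braces can be stripped and the bracketed part parsed. This is only a minimal sketch, assuming every value looks like `{[...]}` with plain numeric entries; `parse_cell` is a hypothetical helper name and is not part of the original answer:

```python
import ast


def parse_cell(text):
    """Turn a string like '{[2,1,2]}' into the Python list [2, 1, 2]."""
    # Assumption: one pair of outer braces wrapping a bracketed list.
    inner = text.strip().strip("{}")
    # literal_eval only evaluates literals, so it is safer than eval().
    return ast.literal_eval(inner)


# Grouped strings in the shape produced by the answer's code.
grouped = {"A1": ["{[2,1,2]}", "{[4,2,1,2,3]}"]}
parsed = {name: [parse_cell(cell) for cell in cells]
          for name, cells in grouped.items()}
print(parsed)  # {'A1': [[2, 1, 2], [4, 2, 1, 2, 3]]}
```

Values that use MATLAB-only syntax (for example space-separated entries) would not parse this way and would need their own handling.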

To group the data in the file itself, just write the contents back:

with open("infile", "w") as f:
    for k, v in od.items():
        # Join the grouped arrays with spaces and terminate with a semicolon.
        f.write("{}=[{}];\n".format(k, " ".join(v)))

Output:

A1=[{[2,1,2]} {[4,2,1,2,3]}];
A2=[{[3,3,2,1]} {[4,4,2,2]} {[2,2,1,1,1]}];

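This is exactly your desired output: the semicolon is removed from each sub-array, the elements are grouped, and a semicolon is added to the end of each group so the data stays valid in your MATLAB file.

collections.OrderedDict preserves the order from the original file. On Python versions before 3.7 a plain dict does not guarantee any order; from 3.7 on, plain dicts also preserve insertion order.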
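
On Python 3.7 and later a plain dict keeps insertion order, so the same grouping can be sketched without OrderedDict. The sample lines below follow the format shown in the question; `group_lines` is a hypothetical helper name used only for illustration:

```python
def group_lines(lines):
    """Group 'name=value;' lines by name, keeping first-seen order (Python 3.7+)."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines instead of failing on split
        name, data = line.split("=", 1)
        groups.setdefault(name, []).append(data.rstrip(";"))
    return groups


sample = ["A1={[2,1,2]};", "A2={[3,3,2,1]};", "A1={[4,2,1,2,3]};"]
result = group_lines(sample)
print(list(result))  # ['A1', 'A2']
```

The keys come out in the order each name first appears, even when lines for the same name are not adjacent.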

A safer approach when updating the file is to write to a temporary file with NamedTemporaryFile and then replace the original file with the updated one using shutil.move:

from collections import OrderedDict
from shutil import move
from tempfile import NamedTemporaryFile

od = OrderedDict()

# Mode "w" opens the temp file as text; the default is binary, which rejects str on Python 3.
with open("infile") as f, NamedTemporaryFile("w", dir=".", delete=False) as temp:
    for line in f:
        name, data = line.split("=")
        od.setdefault(name, []).append(data.rstrip("\n;"))
    for k, v in od.items():
        temp.write("{}=[{}];\n".format(k, " ".join(v)))
# Replace the original only after the temp file has been fully written and closed.
move(temp.name, "infile")

If the code raises an error in the loop, or your computer crashes during the write, your original file is preserved.
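
The same idea can be wrapped in a function that also removes the temporary file when something fails. This is a sketch under stated assumptions, not the answer's own code: `regroup_file` is a hypothetical name, and `os.replace` is swapped in for `shutil.move` because the temporary file is created in the same directory as the target.

```python
import os
from tempfile import NamedTemporaryFile


def regroup_file(path):
    """Rewrite `path` so that same-named entries share one line each."""
    groups = {}
    directory = os.path.dirname(os.path.abspath(path))
    # Text mode so str can be written; delete=False so the file survives close().
    temp = NamedTemporaryFile("w", dir=directory, delete=False)
    try:
        with temp, open(path) as src:
            for line in src:
                if not line.strip():
                    continue  # ignore blank lines
                name, data = line.split("=", 1)
                groups.setdefault(name, []).append(data.rstrip("\n;"))
            for name, cells in groups.items():
                temp.write("{}=[{}];\n".format(name, " ".join(cells)))
    except Exception:
        os.remove(temp.name)  # clean up; the original file is untouched
        raise
    # Swap the finished file into place in a single step.
    os.replace(temp.name, path)


# Usage (assumes a file named "infile" in the current directory):
# regroup_file("infile")
```

If parsing fails partway through, the exception propagates, the temporary file is deleted, and the original file keeps its old contents.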

For more on "python - How do I group arrays with the same name using Python?", see the similar question on Stack Overflow: https://stackoverflow.com/questions/30556404/
