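python - Sort a file by a column and get the unique elements

I want to sort a file by the contents of its second column and get the unique elements of that column:

Original file: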

qoow_12_xx7_21  wer1    rwty3
asss_x17_211    aqe3    sda4
acyi_112_werxc  xcu12   weqa1
qwer_234_ssd    aqe3    wers

Expected sorted output:

asss_x17_211    aqe3    sda4
qwer_234_ssd    aqe3    wers
qoow_12_xx7_21  wer1    rwty3
acyi_112_werxc  xcu12   weqa1

Expected unique col2 values:

aqe3
wer1
xcu12

My attempt, which does not work:

from operator import itemgetter
import itemgetter


def get_unique(data):
    seen=""
    for e in data:
        if e not in seen:
            seen="\t".join(seen) 
    return seen

col2=""
with open("myfile.txt", "r") as infile, open("out.xls","w") as outfile:
    for line in infile:
        data=line.rstrip.split("\t")
        sorted_data=sorted(data, key=lambda e: e.itemgetter)
        col2="".join(data[1])
    uniq_col2=get_unique(col2)
    outfile.write(sorted_data)# tab-delimited sorted data
    outfile.write(uniq_col2) # sorted column 2 data

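Can someone help me get this code working? Thanks.

Best answer

Try this. It sorts the lines by the second column, writes them out, then writes each distinct second-column value once: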

from operator import itemgetter

with open('test.txt') as infile, open('out.txt', 'w') as outfile:
    # sort input by 2nd column
    sorted_lines = sorted(
        (line.strip().split() for line in infile),
        key=itemgetter(1)
        )

    # output sorted input
    for line in sorted_lines:
        outfile.write('\t'.join(line))
        outfile.write('\n')

    # discard duplicates in already sorted sequence => uniq items
    prev_item = None
    for item in (line[1] for line in sorted_lines):
        if item != prev_item:
            prev_item = item
            outfile.write(item)
            outfile.write('\n')
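
The same idea can also be written as a small function that works on any iterable of lines, which makes it easier to try without files. This is only a sketch: the name `sort_and_uniq` and the `col` parameter are made up for illustration, and it swaps the `prev_item` loop for `itertools.groupby`, which likewise relies on the rows already being sorted. It also assumes every non-blank line has at least `col + 1` whitespace-separated fields.

```python
from itertools import groupby
from operator import itemgetter


def sort_and_uniq(lines, col=1):
    """Return (rows sorted by `col`, unique values of `col` in sorted order).

    Hypothetical helper, not part of the original answer.
    """
    # Split each line on whitespace; blank lines are skipped.
    rows = [line.split() for line in lines if line.strip()]
    # list.sort() is stable, so rows with equal keys keep their input order.
    rows.sort(key=itemgetter(col))
    # groupby() collapses runs of equal keys; this only yields unique
    # values because the rows were sorted on the same key first.
    uniq = [key for key, _ in groupby(rows, key=itemgetter(col))]
    return rows, uniq


sample = [
    "qoow_12_xx7_21\twer1\trwty3",
    "asss_x17_211\taqe3\tsda4",
    "acyi_112_werxc\txcu12\tweqa1",
    "qwer_234_ssd\taqe3\twers",
]
rows, uniq = sort_and_uniq(sample)
for row in rows:
    print("\t".join(row))
print("\n".join(uniq))
```

With the sample data from the question, this prints the four rows ordered by the second column, followed by `aqe3`, `wer1` and `xcu12`. To use it on a file, pass the open file object as `lines`.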

Source: a similar question on Stack Overflow about sorting a file by a column and getting the unique elements: https://stackoverflow.com/questions/27197047/
