linux - Is there a command to merge multiple files in the shell?

For example, there are 5 numbers => [1,2,3,4,5] and 3 groups

File 1 (Group 1):

1
3
5

File 2 (Group 2):

3
4

File 3 (Group 3):

1
5

Output (column 1: whether in Group 1, column 2: whether in Group 2, column 3: whether in Group 3 [NA means not..]):

1 NA 1
3 3 NA
NA 4 NA
5 NA 5

Or like this (+ means in the group, - means not):

1 + - +
3 + + -
4 - + -
5 + - +

I tried join and merge, but it looks like neither of them works well for multiple files (for example, 8 files).

Best answer

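You say there are the numbers 1-5, but as far as I can tell that is irrelevant to your desired output: the output only uses the numbers found in your files. This code will do what you want: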

use strict;
use warnings;
use feature 'say';

my @hashes;
my %seen;
local $/;   # read entire file at once
while (<>) {
    my @nums = split;                          # split file into elements
    $seen{$_}++ for @nums;                     # dedupe elements
    push @hashes, { map { $_ => $_ } @nums };  # map into hash
}

my @all = sort { $a <=> $b } keys %seen;       # sort deduped elements
# my @all = 1 .. 5;                            # OR: provide hard-coded list

for my $num (@all) {                           # for all unique numbers
    my @fields;
    for my $href (@hashes) {                   # check each hash
        push @fields, $href->{$num} // "NA";   # enter "NA" if not found
    }
    say join "\t", @fields;                    # print the fields
}

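You can replace the sorted, deduplicated list in @all with my @all = 1 .. 5 or any other valid list. The script will then emit a row for each of those numbers, and a number that appears in no file gets a row consisting only of "NA" fields.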
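To illustrate the hard-coded list, here is a minimal sketch that works on in-memory lists instead of files. The helper name build_rows and the @groups data are assumptions made for this illustration (the data mirrors the three sample files); they are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: build one tab-separated row per key.
# $keys is an array ref of row keys, $groups an array ref of array refs
# (one inner array per file/group).
sub build_rows {
    my ($keys, $groups) = @_;
    my @hashes;
    for my $group (@$groups) {
        push @hashes, { map { $_ => $_ } @$group };   # value => value lookup
    }
    my @rows;
    for my $key (@$keys) {
        my @fields;
        for my $href (@hashes) {
            push @fields, $href->{$key} // "NA";      # "NA" if not in this group
        }
        push @rows, join "\t", @fields;
    }
    return @rows;
}

my @groups = ([1, 3, 5], [3, 4], [1, 5]);             # the three sample files
print "$_\n" for build_rows([1 .. 5], \@groups);
```

With the list 1 .. 5, the number 2 is in none of the groups, so its row contains three "NA" fields.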

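You should also know that this relies on your file contents being numeric, but only in the sort of the @all array. If you replace it with your own list or your own sort routine, you can use any values.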
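As a sketch of the "own sort routine" case: for non-numeric values, the numeric comparison <=> can be swapped for the string comparison cmp (under use warnings, <=> on non-numeric strings produces "isn't numeric" warnings). The helper name unique_sorted and the sample words are assumptions for this illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the unique values from all groups
# and return them sorted as strings rather than as numbers.
sub unique_sorted {
    my @groups = @_;                       # each group is an array ref
    my %seen;
    for my $group (@groups) {
        $seen{$_}++ for @$group;           # dedupe across all groups
    }
    my @sorted = sort { $a cmp $b } keys %seen;
    return @sorted;
}

my @all = unique_sorted([qw(apple cherry)], [qw(banana cherry)]);
print "@all\n";                            # apple banana cherry
```

The resulting @all can then drive the same row loop as in the script above.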

This script takes any number of files and processes them. For example:

$ perl script.pl f1.txt f2.txt f3.txt
1       NA      1
3       3       NA
NA      4       NA
5       NA      5
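The question also shows an alternative layout with the number first, followed by + or - per group. The following is a rough sketch of that variant, not part of the original answer: the helper name presence_rows is an assumption, and the script assumes whitespace-separated numeric values, one file per group, given on the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: one row per unique number, followed by
# "+" or "-" for each group depending on membership.
sub presence_rows {
    my @groups = @_;                       # each group is an array ref
    my (@sets, %seen);
    for my $group (@groups) {
        my %set = map { $_ => 1 } @$group; # membership lookup for this group
        $seen{$_}++ for keys %set;         # track every number seen anywhere
        push @sets, \%set;
    }
    my @rows;
    for my $num (sort { $a <=> $b } keys %seen) {
        my @marks = map { $_->{$num} ? "+" : "-" } @sets;
        push @rows, join " ", $num, @marks;
    }
    return @rows;
}

# Read each file named on the command line into its own group.
my @file_groups;
for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my @nums;
    while (my $line = <$fh>) {
        push @nums, split ' ', $line;      # whitespace-separated values
    }
    close $fh;
    push @file_groups, \@nums;
}
print "$_\n" for presence_rows(@file_groups);
```

Called as perl script.pl f1.txt f2.txt f3.txt on the sample files from the question, this should print the four rows of the +/- table shown there.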

Credit to Brent Stewart for figuring out what the OP meant.

Regarding "linux - Is there a command to merge multiple files in the shell?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/15093874/
