perl - How do I remove lines from a list that can be found inside other, longer lines in the list?

I have a file, list.txt, that looks like this:

cat
bear
tree
catfish
fish
bear

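I need to remove any line whose full text already appears elsewhere in the file, whether as a duplicate line or inside another, longer line. For example, the two "bear" lines are identical, so one of them should be removed; "cat" is fully contained in "catfish", so "cat" should be removed. The output would look like this: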

catfish
tree
bear

How can I remove all duplicate lines, including lines that are contained within longer lines in the list?

So far, I have this:

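#!/bin/bash
touch list.tmp
while read -r line
do
    # Count the lines already in list.tmp that contain $line as a literal string.
    found="$(grep -cF -- "$line" list.tmp)"
    # Add the line only if nothing kept so far contains it. This still misses a
    # short line that is read before its longer container (cat before catfish).
    if [ "$found" -eq 0 ]
    then
        echo "$line" >> list.tmp
        echo "$line added"
    else
        echo "Not added."
    fi
done < list.txt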

Best answer

If O(N^2) doesn't bother you:

#!/usr/bin/env perl

use strict;
use warnings;
use List::MoreUtils qw{any};

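my @words;    # lines kept so far
for my $word (
    # Longest first, so every line long enough to contain $word has already been processed.
    sort {length $b <=> length $a}
    do {
        # Read all input lines and drop exact duplicates by using them as hash keys.
        my %words;
        my @words = <>;
        chomp @words;
        @words{@words} = ();
        keys %words;
    }
)
{
    # Keep $word unless it is a literal substring of a line that was already kept.
    push @words, $word unless do {
        my $re = qr/\Q$word/;
        any {m/$re/} @words;
    };
}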

print "$_\n" for @words;
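
For comparison, the same quadratic idea can be sketched without any CPAN modules. This is only a rough variant, not the answer's method: the function name `filter_contained` is made up for illustration, and instead of sorting by length it compares every pair of lines. Exact duplicates are dropped first, so any line that contains a different line must be longer than it, which makes the sort unnecessary. Matching is literal (no regex), and kept lines come out in their original input order.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): reads lines on stdin and
# prints those that are not contained in any other distinct line.
filter_contained() {
    awk '
        # Store each distinct line once, in input order.
        !seen[$0]++ { lines[++n] = $0 }
        END {
            for (i = 1; i <= n; i++) {
                keep = 1
                for (j = 1; j <= n; j++) {
                    # index() does a literal substring search; a hit means
                    # lines[i] occurs inside the longer lines[j].
                    if (i != j && index(lines[j], lines[i]) > 0) {
                        keep = 0
                        break
                    }
                }
                if (keep) print lines[i]
            }
        }
    '
}
```

Running `filter_contained < list.txt` on the sample file from the question prints bear, tree and catfish, one per line. Like the Perl version, it does O(N^2) substring checks, so it is only suitable for lists of moderate size.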

If you want O(N log N), you will have to use some kind of trie approach, for example a suffix tree:

#!/usr/bin/env perl

use strict;
use warnings;
use Tree::Suffix;

my $tree = Tree::Suffix->new();

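my @words;    # lines kept so far
for my $word (
    # Longest first, so every line long enough to contain $word has already been processed.
    sort {length $b <=> length $a}
    do {
        # Read all input lines and drop exact duplicates by using them as hash keys.
        my %words;
        my @words = <>;
        chomp @words;
        @words{@words} = ();
        keys %words;
    }
)
{
    # Keep $word only if it does not occur inside any string already in the tree.
    unless ($tree->find($word)){
        push @words, $word;
        $tree->insert($word);
    };
}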

print "$_\n" for @words;

Regarding "perl - How do I remove lines from a list that can be found inside other, longer lines in the list?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/17778094/
