awk - delete lines that have the same field/string

I rewrote my previous question because it was unclear.

I have test1.txt formatted this way (this example has 3 lines):

Link;alfa (zz);some text;
Link;alfa (zz);other text;other text2;
Link;jack;

In another text file, test2.txt, my text is formatted this way: no ; separator, just plain strings (this example has 2 lines):

tommy emmanuel
alfa (zz)

In test2.txt I never have the word Link. I may have ( ), but I never have the ; separator.

I want result.txt to be written this way:

Link;jack;

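The logic behind it: test2.txt contains alfa (zz). The same string appears in test1.txt as a field between ; separators on the first and second lines. Condition: if a field matches like this, the whole line should be deleted, which is why I expect only the third line: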

Link;jack;

I tested this code:

sed 's/.*Link;//;s/;.*//' test2.txt | grep -Fvf- test1.txt

And this:

awk -F \; '
FNR == NR {cull[$0]=""}
FNR != NR {
    for (str in cull) {
        if ($2 == str) {
            next
        }
    }
    print
}' test2.txt test1.txt > culled.txt

The problem is that it writes out the same lines unchanged and does not delete the lines that have the matching field.
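One thing worth checking when an exact field comparison like this removes nothing is whether test2.txt contains invisible characters, such as trailing spaces or Windows carriage returns. The question does not confirm that this is the cause, so the sketch below is only a hypothetical reproduction: it assumes test2.txt was saved with CRLF line endings and uses a scratch directory created with mktemp.

```shell
# Hypothetical reproduction: a key file saved with CRLF line endings (assumption).
tmp=$(mktemp -d)
printf 'alfa (zz)\r\n' > "$tmp/test2.txt"
printf '%s\n' 'Link;alfa (zz);some text;' 'Link;jack;' > "$tmp/test1.txt"

# Same membership test on field 2, with no cleanup of the keys.
# The stored key is "alfa (zz)" plus a carriage return, so it never
# equals the field "alfa (zz)" and both lines are kept.
kept=$(awk -F';' 'FNR == NR { cull[$0]; next } !($2 in cull)' \
    "$tmp/test2.txt" "$tmp/test1.txt")
printf '%s\n' "$kept" | wc -l    # counts the lines that survived

# od -c makes the invisible character visible: the carriage return shows up as \r.
od -c "$tmp/test2.txt"
```

If od shows \r or stray spaces at the end of the keys, they need to be stripped before comparing.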

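Updated question:

Following anubhava's answer and this example, lines are still not deleted when strings like the following are present.

If test2.txt contains: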

Dark Tranquillity - A Moonclad Reflection [ep] (1992) Melodic Death Metal
Dark Tranquillity - A Closer End [best of_compilation] (2008) Melodic Death Metal

then I cannot match and delete these lines in test1.txt:

Link;Dark Tranquillity - A Moonclad Reflection [ep] (1992) Melodic Death Metal;Dark Tranquillity - A moonclad reflection [7'' Ep 1992_Slaughter Rec.].rar;https://disk.yandex.com/public?hash=JA7Gu2CysxSf2HhAKaBxmU%2By27B6dPd6uRwPFu%2B9x0s%3D;https://metalarea.org/forum/index.php?showtopic=5037

Link;Dark Tranquillity - A Closer End [best of_compilation] (2008) Melodic Death Metal;Dark Tranquillity - A Closer End [2008].rar;https://disk.yandex.com/public?hash=RCZbOrqci8lX%2Fa%2BPzhB6vchlr5rXyc%2B2NHiJNCu%2BQYM%3D;https://metalarea.org/forum/index.php?showtopic=48557

Best answer

You can use this awk solution:

awk -F';' 'FNR == NR {
   gsub(/^[[:space:]]+|[[:space:]]+$/, "")
   cull[$0]
   next
}
!($2 in cull)' test2.txt test1.txt > culled.txt

cat culled.txt

Link;Dark Tranquillity - Enter Suicidal Angels [ep] (1996) Melodic Death Metal;Dark Tranquillity 1996 - Enter Suicidal Angels (EP).rar;https://disk.yandex.com/public?hash=fBvwBTBJ8%2Fx1mWXvl7usrAMe06esHZFDmHJWF8E2T6LK7Wvfu9Q5Qja9cb5JAU%2Fzq%2FJ6bpmRyOJonT3VoXnDag%3D%3D;https://metalarea.org/forum/index.php?showtopic=5041
Link;Sea Of Tranquillity - Darkened [demo] (1993) Death_Thrash Metal;eaofT3Dd.7z;https://disk.yandex.com/public?hash=DzCNqfEv2pydYzB0YWVvetV2Jx8QDCwktop3y8PIC%2BD5W%2Fnt8ikX81%2F7cf49g8dNq%2FJ6bpmRyOJonT3VoXnDag%3D%3D;https://metalarea.org/forum/index.php?showtopic=153504
Link;Dark Tranquillity - The Absolute [single] (2017) Melodic Death Metal (D);Dark Tranquillity - The Absolute (2017) Single MCD (+SATANIST666+).rar;https://cloud.mail.ru/public/ckFc/A5sQ6pqhb;
Link;Dark Tranquillity - Trail Of Life Decayed [ep] (1992) Melodic Death Metal;1991 - Trail Of Life Decayed.7z;https://www.mediafire.com/file/5pi74bqvujea9rg/1991_-_Trail_Of_Life_Decayed.7z/file;https://metalarea.org/forum/index.php?showtopic=5115
Link;Dark Tranquillity - Phantom Days [single] (2020) Melodic Death Metal (D);Dark Tranquillity - Phantom Days (2020) by Andrew.rar;https://www.mediafire.com/file/ho25i02j3ybgmty/Dark_Tranquillity_-_Phantom_Days_%25282020%2529_by_Andrew.rar/file;https://metalarea.org/forum/index.php?showtopic=341548
Link;Dark Tranquillity - Of Chaos And Eternal Night [ep] (1995) Melodic Death Metal;Dark Tranquillity - Of Chaos And Eternal Night (EP) [1995].rar;https://disk.yandex.com/public?hash=Ax%2B2Gfqzr9%2FdS87cgRcUhGBCoQzKZfz5ZDUa2U%2Fbsn4%3D;https://metalarea.org/forum/index.php?showtopic=5040
Link;Dark Tranquillity - Of Chaos And Eternal Night [ep] (1995) Melodic Death Metal;Dark Tranquillity 1995 - Of Chaos And Eternal Night (EP).rar;https://disk.yandex.com/public?hash=le8r7ZI%2F%2BTw2CjsDbNFriDZdCGpSy1hj%2BoQdGHrrBFdcJM8eIp%2F3J17qG5MjC1Fgq%2FJ6bpmRyOJonT3VoXnDag%3D%3D;https://metalarea.org/forum/index.php?showtopic=5040

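There is no need for a for loop. Just build an associative array while reading test2.txt (the first file argument, where FNR == NR), then, while reading test1.txt, print only the lines whose second column is not a key in the array cull. The gsub call strips leading and trailing whitespace from each key before it is stored, so stray spaces in test2.txt do not prevent a match.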
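As a quick check, the same awk program can be run against the three-line sample from the question. This is a minimal sketch, not part of the original answer: the wrapper function name cull_lines, the scratch directory, and the trailing spaces added after alfa (zz) are assumptions made for illustration.

```shell
# cull_lines KEYS DATA: print the lines of DATA whose second ;-separated
# field is not listed in KEYS. The function name is hypothetical.
cull_lines() {
    awk -F';' 'FNR == NR {
        gsub(/^[[:space:]]+|[[:space:]]+$/, "")  # trim the key, including any \r
        cull[$0]                                 # referencing the element creates the key
        next
    }
    !($2 in cull)' "$1" "$2"
}

# Sample data from the question, written to a scratch directory.
# The two trailing spaces after "alfa (zz)" are an assumption, added
# to show why the keys are trimmed before being stored.
dir=$(mktemp -d)
printf '%s\n' 'Link;alfa (zz);some text;' \
              'Link;alfa (zz);other text;other text2;' \
              'Link;jack;' > "$dir/test1.txt"
printf '%s\n' 'tommy emmanuel' 'alfa (zz)  ' > "$dir/test2.txt"

cull_lines "$dir/test2.txt" "$dir/test1.txt"   # prints: Link;jack;
```

One caveat of the FNR == NR idiom: if the key file is empty, the condition is also true for every line of the data file, so nothing is printed.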

Regarding "awk - delete lines that have the same field/string", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73513409/
