for-loop - Can I use an awk "for loop" to apply '!seen[$0]' to multiple .txt files?

Tags: for-loop awk

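I want to remove duplicate lines from several .txt files. Each file has to be processed independently, because the files are unrelated to one another. For a single file I use: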

awk '!seen[$0]++' file.txt > file.out

But when I try:

for f in *.txt; do awk '!seen[$0]++' $f > $f.out; done

I get errors:
cannot open file "$f", or awk or '!seen[$0]' is not recognized. Sometimes I do get an output file, but it is the same file...

Best answer

With a non-GNU awk, please try the following.

awk -v temp_out="file.out" '
FNR==1{
  if(prev_filename){
    close(temp_out)
    sub(/\.txt$/,".out",prev_filename)
    system("mv -- \047" temp_out "\047 \047" prev_filename "\047")
  }
  prev_filename=FILENAME
  delete seen
}
!seen[$0]++{
  print > (temp_out)
}
END{
  if(prev_filename){
    close(temp_out)
    sub(/\.txt$/,".out",prev_filename)
    system("mv -- \047" temp_out "\047 \047" prev_filename "\047")
  }
}
' *.txt

Explanation: the same code with a comment on each line.

awk -v temp_out="file.out" '                        ##Start the awk program and set the variable temp_out to file.out.
FNR==1{                                             ##If this is the first line of an input file, do the following.
  if(prev_filename){                                ##If prev_filename is set (a previous file was processed), do the following.
    close(temp_out)                                 ##Close the temp file so its contents are flushed.
    sub(/\.txt$/,".out",prev_filename)              ##Replace the trailing .txt with .out in the previous file name.
    system("mv -- \047" temp_out "\047 \047" prev_filename "\047")        ##Rename the temp file to prev_filename (now ending in .out) via the shell.
  }
  prev_filename=FILENAME                            ##Remember the current FILENAME as prev_filename.
  delete seen                                       ##Clear the array seen so each file is de-duplicated independently.
}
!seen[$0]++{                                        ##If the current line has not been seen yet in this file, do the following.
  print > (temp_out)                                ##Print the current line to the temp file.
}
END{                                                ##END block, run after the last input file.
  if(prev_filename){                                ##If prev_filename is set, do the following.
    close(temp_out)                                 ##Close the temp file so its contents are flushed.
    sub(/\.txt$/,".out",prev_filename)              ##Replace the trailing .txt with .out in the last file name.
    system("mv -- \047" temp_out "\047 \047" prev_filename "\047")        ##Rename the temp file to prev_filename (now ending in .out) via the shell.
  }
}
' *.txt                                             ##Pass all .txt files as input.
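The shell loop from the question can also be made to work. Here is a minimal sketch, assuming the failures came from a missing redirection and unquoted variables. The demo directory and the file names first.txt and "second file.txt" are made up for illustration and are not from the original post. Each file gets its own awk process, so the seen array starts empty for every file, and "${f%.txt}.out" strips the .txt suffix before adding .out.

```shell
#!/bin/sh
# Hypothetical demo data: two unrelated files containing repeated lines.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'a\nb\na\nc\nb\n' > first.txt
printf 'x\nx\ny\n' > 'second file.txt'

# Run one awk process per file so duplicates are tracked per file only.
for f in *.txt; do
  [ -e "$f" ] || continue                    # skip the literal pattern if nothing matches
  # Quote "$f" so names with spaces survive; write next to the input as NAME.out.
  awk '!seen[$0]++' "$f" > "${f%.txt}.out"
done
```

To use it on real data, drop the demo lines and run only the for loop in the directory that holds the .txt files. Because the output names end in .out, they are not matched again by *.txt on a later run.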

Regarding "for-loop - Can I use an awk "for loop" to apply '!seen[$0]' to multiple .txt files?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61847258/
