linux - Bash: How to read lines from a txt file and identify and merge duplicates by column value

Tags: linux bash line grouping

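I am writing a Bash script that collects data from several log files, removes duplicate values, sorts the result by user name, and stores it in output.txt. Next, I want to read output.txt line by line and, whenever a user name appears more than once, create a single new line that combines the data from those lines.

The goal is to send each user only one email telling them that they cannot use this feature on this server.

I am not sure I have explained this well, so here is an example of output.txt: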

output.txt

13:49:19 DENIED: "Software_1" UserA serv7 (Can't run this feature. )
13:49:19 DENIED: "Software_2" UserA serv7 (Can't run this feature. )
15:09:14 DENIED: "Software_3" UserB serv5 (Can't run this feature. )
15:09:15 DENIED: "Software_4" UserB serv5 (Can't run this feature. )
17:20:43 DENIED: "Software_3" UserC serv5 (Can't run this feature. )
17:20:43 DENIED: "Software_5" UserC serv8 (Can't run this feature. )

Expected result

Software_1, Software_2, UserA serv7 (Can't run this feature. )
Software_3, Software_4, UserB serv5 (Can't run this feature. )
Software_3, Software_5, UserC serv5, serv8 (Can't run this feature. )

Can someone suggest a solution and explain how it works?

Best answer

$ cat userlog_unparsed.log
13:49:19 DENIED: "Software_1" UserA serv7 (Can't run this feature. )
13:49:19 DENIED: "Software_2" UserA serv7 (Can't run this feature. )
15:09:14 DENIED: "Software_3" UserB serv5 (Can't run this feature. )
15:09:15 DENIED: "Software_4" UserB serv5 (Can't run this feature. )
17:20:43 DENIED: "Software_3" UserC serv5 (Can't run this feature. )
17:20:43 DENIED: "Software_5" UserC serv8 (Can't run this feature. )


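$ awk '
     # Requires GNU awk 4.0+ (arrays of arrays).
     # $3 = software, $4 = user, $5 = server
     { sws[$4][$3]++; srvs[$4][$5]++ }
     END {
         for (user in sws) {
             swuser = ""; srvuser = ""
             # Join each set with commas; substr(..., 2) drops the leading comma.
             for (sw in sws[user])   { swuser  = swuser  "," sw }
             for (srv in srvs[user]) { srvuser = srvuser "," srv }
             print substr(swuser, 2) ", " user ", " substr(srvuser, 2)
         }
     }' userlog_unparsed.log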

"Software_2","Software_1", UserA, serv7
"Software_3","Software_4", UserB, serv5
"Software_3","Software_5", UserC, serv5,serv8

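Explanation:

  1. While reading the file, record each user's software ($3) and servers ($5) in two arrays keyed by user name ($4). Because the values are stored as array keys, a repeated software or server name is kept only once per user.
  2. In the END block, loop over the users, join each user's software and server names with commas, and print one line per user.

Note that sws[$4][$3] is an array of arrays, which requires GNU awk 4.0 or later. Also, for (x in array) visits keys in an unspecified order, which is why "Software_2" is printed before "Software_1" in the output above.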
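The answer's output keeps the quotes around the software names and drops the trailing message, so it does not quite match the format asked for in the question. The following is a minimal sketch of one way to get closer, not the original author's method. It assumes the field layout shown in the sample data (software in $3, user in $4, server in $5) and that the message starts at the first "(" on each line. The function name merge_denied and the helper arrays are made up for illustration. It avoids arrays of arrays, so it should also run on a POSIX awk, and it prints users in the order they first appear.

```shell
#!/usr/bin/env bash

# merge_denied (hypothetical name): print one merged line per user.
# Reads the files given as arguments, or standard input if none are given.
merge_denied() {
    awk '
    {
        user = $4
        srv  = $5
        sw   = $3
        gsub(/"/, "", sw)                     # strip the quotes around the software name

        # Remember the order in which users first appear.
        if (!(user in seen_user)) {
            seen_user[user] = 1
            order[++n] = user
        }
        # Append each software name once per user, followed by ", ".
        if (!((user, sw) in seen_sw)) {
            seen_sw[user, sw] = 1
            sws[user] = sws[user] sw ", "
        }
        # Append each server once per user, separated by ", ".
        if (!((user, srv) in seen_srv)) {
            seen_srv[user, srv] = 1
            sep = (srvs[user] == "") ? "" : ", "
            srvs[user] = srvs[user] sep srv
        }
        # Assumption: the message is everything from the first "(" onward.
        msg[user] = substr($0, index($0, "("))
    }
    END {
        for (i = 1; i <= n; i++) {
            u = order[i]
            print sws[u] u " " srvs[u] " " msg[u]
        }
    }' "$@"
}
```

With the sample file from the question, merge_denied output.txt would print one line per user, for example: Software_3, Software_5, UserC serv5, serv8 (Can't run this feature. )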

This question, "linux - Bash: How to read lines from a txt file and identify and merge duplicates by column value", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/27419591/
