I need your help... I have text like this:
2016.04.10 19:24:00,044 +0300 basdahsdjashd asjd ashdjkl [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]
2016.04.07 14:29:09,126 +0300 jsjdgdbcgf jjsgftr kksgcxdw2 [{"socialSecurityNumber":"00299288282224","socialSecurityNumberCountryCode":"EE"}]
2016.04.05 22:01:32,005 +0300 jafhaljdhf afs ljhsdhfl adf tng-customer-id=9303801442
2016.04.05 20:44:51,003 +0300 pppcndhfgus23 ofkgjg jdghhfye uksd tng-customer-id=2875223046
The output I need is (the first and second columns, plus either the socialSecurityNumber or the tng-customer-id):
2016.04.10 19:24:00,044 "socialSecurityNumber":"68888410106514"
2016.04.07 14:29:09,126 "socialSecurityNumber":"00299288282224"
2016.04.05 22:01:32,005 tng-customer-id=9303801442
2016.04.05 20:44:51,003 tng-customer-id=2875223046
So the question is: can this be solved with the sed command? I need an OR option here.
If I try the two cases separately, first matching socialSecurityNumber, I get:
wsslogfetcher ~/temp/log_parser$ sed 's/\([^+]*\).*\("socialSecurityNumber"[^,]*\).*/\1 \2/' testfile.txt
2016.04.10 19:24:00,044 "socialSecurityNumber":"68888410106514"
2016.04.07 14:29:09,126 "socialSecurityNumber":"00299288282224"
2016.04.05 22:01:32,005 +0300 jafhaljdhf afs ljhsdhfl adf tng-customer-id=9303801442
2016.04.05 20:44:51,003 +0300 pppcndhfgus23 ofkgjg jdghhfye uksd tng-customer-id=2875223046
Second, matching tng-customer-id, I get:
wsslogfetcher ~/temp/log_parser$ sed 's/\([^+]*\).*\(tng-customer-id[^ ]*\).*/\1 \2/' testfile.txt
2016.04.10 19:24:00,044 +0300 basdahsdjashd asjd ashdjkl [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]
2016.04.07 14:29:09,126 +0300 jsjdgdbcgf jjsgftr kksgcxdw2 [{"socialSecurityNumber":"00299288282224","socialSecurityNumberCountryCode":"EE"}]
2016.04.05 22:01:32,005 tng-customer-id=9303801442
2016.04.05 20:44:51,003 tng-customer-id=2875223046
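As you can see, in the first example the last two lines contain no socialSecurityNumber, so sed prints them unchanged. The same happens in the second example with the first two lines, which contain no tng-customer-id.
When I try to combine both cases with the OR operator in a single sed command, I get this wrong output (the first two lines keep everything after the number):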
wsslogfetcher ~/temp/log_parser$ sed 's/\([^+]*\).*\(\("socialSecurityNumber"[^,]*\).*\|\(tng-customer-id=[^ ]*\).*\)/\1 \2/' testfile.txt
2016.04.10 19:24:00,044 "socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]
2016.04.07 14:29:09,126 "socialSecurityNumber":"00299288282224","socialSecurityNumberCountryCode":"EE"}]
2016.04.05 22:01:32,005 tng-customer-id=9303801442
2016.04.05 20:44:51,003 tng-customer-id=2875223046
So... what am I doing wrong?
Best answer
Use this sed command:
sed 's/^\([^ ]*\) \([^ ]*\).*\("socialSecurityNumber":"[^"]*"\|tng-customer-id=[^ ]*\).*$/\1 \2 \3/g' file
Test:
$ sed 's/^\([^ ]*\) \([^ ]*\).*\("socialSecurityNumber":"[^"]*"\|tng-customer-id=[^ ]*\).*$/\1 \2 \3/g' a
2016.04.10 19:24:00,044 "socialSecurityNumber":"68888410106514"
2016.04.07 14:29:09,126 "socialSecurityNumber":"00299288282224"
2016.04.05 22:01:32,005 tng-customer-id=9303801442
2016.04.05 20:44:51,003 tng-customer-id=2875223046
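The \| alternation used above is a GNU extension to basic regular expressions. As a hedged sketch, the same substitution can be written with extended regular expressions, assuming a sed that accepts -E (GNU and BSD sed both do). Adding -n and the p flag prints only the lines where the substitution succeeded, so lines containing neither field are dropped instead of being echoed unchanged. The fifth input line and the names tmpfile and result are made up for illustration and are not part of the original question.

```shell
# Recreate the sample input from the question in a temporary file.
# The last line is hypothetical: it contains neither field.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
2016.04.10 19:24:00,044 +0300 basdahsdjashd asjd ashdjkl [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]
2016.04.07 14:29:09,126 +0300 jsjdgdbcgf jjsgftr kksgcxdw2 [{"socialSecurityNumber":"00299288282224","socialSecurityNumberCountryCode":"EE"}]
2016.04.05 22:01:32,005 +0300 jafhaljdhf afs ljhsdhfl adf tng-customer-id=9303801442
2016.04.05 20:44:51,003 +0300 pppcndhfgus23 ofkgjg jdghhfye uksd tng-customer-id=2875223046
2016.04.05 20:00:00,000 +0300 a line with neither field
EOF

# -E: extended regex, so groups are ( ) and alternation is a plain |.
# -n with the p flag: print a line only if the substitution matched it.
result=$(sed -nE 's/^([^ ]*) ([^ ]*).*("socialSecurityNumber":"[^"]*"|tng-customer-id=[^ ]*).*$/\1 \2 \3/p' "$tmpfile")

printf '%s\n' "$result"
rm -f "$tmpfile"
```

With this input the four lines from the question are printed in the requested format and the fifth line is suppressed.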
Starting from your command:
sed 's/\([^+]*\).*\(\("socialSecurityNumber"[^,]*\)\|\(tng-customer-id=[^ ]*\)\).*/\1 \2/'
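I removed the .* from inside each of the alternatives wrapped by the single outer group, and left one .* after that group. This way the rest of the line is still matched, but it is no longer captured in \2.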
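To illustrate the point about where the .* sits, here is a minimal sketch that runs two substitutions on the first sample line. They differ only in whether the trailing .* is inside or outside the last capture group. The variable names line, inside and outside are chosen for this example only.

```shell
line='2016.04.10 19:24:00,044 +0300 basdahsdjashd asjd ashdjkl [{"socialSecurityNumber":"68888410106514","socialSecurityNumberCountryCode":"EE"}]'

# .* inside the capture group: \3 also swallows the rest of the line.
inside=$(printf '%s\n' "$line" |
  sed 's/^\([^ ]*\) \([^ ]*\).*\("socialSecurityNumber":"[^"]*".*\)$/\1 \2 \3/')

# .* after the closing \): the tail is matched and discarded, not captured.
outside=$(printf '%s\n' "$line" |
  sed 's/^\([^ ]*\) \([^ ]*\).*\("socialSecurityNumber":"[^"]*"\).*$/\1 \2 \3/')

printf 'inside:  %s\n' "$inside"
printf 'outside: %s\n' "$outside"
```

The inside variant keeps ,"socialSecurityNumberCountryCode":"EE"}] at the end of the line, which is the symptom from the question. The outside variant stops right after the number.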
Regarding linux - formatting output with the sed command, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36547919/