I am writing an awk script to count words.
awk '$1 ~/the/ {++c}END{print c}' FS=: br.txt
awk '$1 ~/not/ {++c}END{print c}' FS=: br.txt
awk '$1 ~/that/ {++c}END{print c}' FS=: br.txt
I also want to format the output so that the header is "the not that" and the line below it holds the count for each word. I am using this:
awk 'BEGIN { print "the not that"
{ printf "%-10s %s\n", $1, $1 }}' br.txt
The problem is that I cannot get the counts onto the line below the words. What should I change or add? Thanks for your effort.
Best answer
Here is an awk command that should do what you need.
awk '$1~/the/ {the++} $1~/not/ {not++} $1~/that/ {that++} END {print "the","not","that\n"the,not,that}' FS=: OFS="\t" br.txt
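To see what such a command prints, here is a minimal sketch run against made-up input. The contents of `br.txt` are not shown in the question, so the sample lines, the temporary file, and the `result` variable below are assumptions for illustration only. The `+0` is an addition that forces a numeric `0` when a pattern never matches (an unset awk variable would otherwise print as an empty string).

```shell
# Hypothetical sample data; the real br.txt is not shown in the question.
tmp=$(mktemp)
printf '%s\n' 'the cat:x' 'not that one:y' 'other thing:z' 'plain:w' > "$tmp"

# Same counting logic; +0 turns an unset counter into a numeric 0.
result=$(awk '
$1 ~ /the/  { the++ }
$1 ~ /not/  { not++ }
$1 ~ /that/ { that++ }
END {
    print "the", "not", "that"
    print the+0, not+0, that+0
}' FS=: OFS="\t" "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

With this sample, the `the` column counts 2 because `/the/` is a substring match, so `other` also matches. Note too that each counter is the number of lines whose first field matches, not the number of occurrences of the word.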
Here is how it works:
awk '
$1~/the/ {the++} # If field 1 contains `the`, add 1 to variable `the`
$1~/not/ {not++} # If field 1 contains `not`, add 1 to variable `not`
$1~/that/ {that++} # If field 1 contains `that`, add 1 to variable `that`
END { # After the whole file has been read, do
print "the","not","that\n"the,not,that} # Print the header, then the values of the variables `the`, `not` and `that`
' FS=: OFS="\t" br.txt # Input field separator = `:`. Output field separator = `<tab>`. Read the file
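If fixed-width columns are preferred over tabs, as the `%-10s` in the question suggests, the `END` block could use `printf` instead of `print`. This is only a sketch of one possible variant, not part of the original answer. The sample lines, the temporary file, and the `aligned` variable are hypothetical.

```shell
# Hypothetical sample data; the real br.txt is not shown in the question.
tmp=$(mktemp)
printf '%s\n' 'the cat:x' 'not that one:y' 'plain:w' > "$tmp"

aligned=$(awk -F: '
$1 ~ /the/  { the++ }
$1 ~ /not/  { not++ }
$1 ~ /that/ { that++ }
END {
    # Left-align the first two columns in 10 characters each.
    printf "%-10s%-10s%s\n", "the", "not", "that"
    # %d prints an unset counter as 0.
    printf "%-10d%-10d%d\n", the, not, that
}' "$tmp")

printf '%s\n' "$aligned"
rm -f "$tmp"
```

Here `-F:` sets the input field separator in the same way as the trailing `FS=:` assignment, and `OFS` is no longer needed because `printf` controls the spacing itself.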
For more on linux - formatting awk output, see the similar question on Stack Overflow: https://stackoverflow.com/questions/28652094/