linux - How do I format lines in a file using shell?

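The goal of the program is to make the comments in a file start in the same column. If a line starts with ; it is left unchanged. If a line starts with code followed by a ; then the program should insert spaces before the ; so that every comment starts in the same column as the one farthest to the right.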

For example:

Before:

; Also change "-f elf " for "-f elf64" in build command. 
; 
section .data                    ; section for initialized data 
str: db 'Hello world!', 0Ah                   ; message string with new-line char 
                               ; at the end (10 decimal)

After:

; Also change "-f elf " for "-f elf64" in build command.                # These two lines don't change
;                                                                       # because they start with ;
section .data                                 ; section for initialized data     
str: db 'Hello world!', 0Ah                   ; message string with new-line char
                                              ; at the end (10 decimal)

I'm a beginner with Linux and shell, and so far I have this:

echo "Enter the filename"
read name

cat $name | while read line;
do ....

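Our teacher told us we should use two while loops: record the longest length before the ; in the first loop, and make the changes in the second while loop. Right now I don't know how to find that longest length using awk or sed.

Any ideas?

Best answer

Here is the solution, assuming a comment in your file starts at the first semicolon (;) that is not inside a string: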

$ cat tst.awk
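# Append the input file name to ARGV again so awk reads the file twice.
BEGIN{ ARGV[ARGC] = ARGV[ARGC-1]; ARGC++ }
{
    # Build a copy of the line with every 'single-quoted' string blanked out,
    # so a semicolon inside a string is not mistaken for a comment start.
    nostrings = ""
    tail = $0
    while ( match(tail,/'[^']*'/) ) {
        nostrings = nostrings substr(tail,1,RSTART-1) sprintf("%*s",RLENGTH,"")
        tail = substr(tail,RSTART+RLENGTH)
    }
    nostrings = nostrings tail
    cur = index(nostrings,";")    # column of the comment start, 0 if none
}
# Pass 1: remember the rightmost comment column.
NR==FNR { max = (cur > max ? cur : max); next }
# Pass 2: pad the code before the comment so the ";" lands in column max.
cur > 1 { $0 = sprintf("%-*s%s", max-1, substr($0,1,cur-1), substr($0,cur)) }
{ print }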

$ awk -f tst.awk file
; Also change "-f elf " for "-f elf64" in build command.
;
section .data                                  ; section for initialized data
str: db 'Hello; world!', 0Ah                   ; message string with new-line char
                                               ; at the end (10 decimal)

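The rest of this answer shows how you could get to that from a simple starting point. (I added a semicolon inside your Hello World! string for testing - make sure to verify every suggested solution against it.)

Note that the above does involve 2 loops over the input, as your teacher suggested, but you don't need to write them by hand because awk provides the loop for you each time it reads the file. If your input file contains tabs or similar then you need to remove them beforehand, e.g. by using pr -e -t.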
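
If the exercise really does require the two loops to be written by hand in shell, a minimal sketch could look like the one below. It is not the awk approach from this answer: align_comments is a hypothetical function name, it assumes bash (or a similar shell with local and printf width support), input without tabs, and that the first ; on a line always starts the comment, so semicolons inside strings are not handled.

```shell
# align_comments FILE - hypothetical helper, a sketch of the two-loop idea.
# Assumption: the first ';' on a line starts the comment (no string handling).
align_comments() {
    local file=$1 line code max=0

    # Loop 1: find the longest piece of text that precedes a ';'.
    while IFS= read -r line; do
        case $line in
            *";"*)
                code=${line%%;*}                 # everything before the first ';'
                if [ "${#code}" -gt "$max" ]; then max=${#code}; fi
                ;;
        esac
    done < "$file"

    # Loop 2: pad that leading text to the recorded width, then re-attach the comment.
    while IFS= read -r line; do
        case $line in
            ";"*)  printf '%s\n' "$line" ;;      # comment-only line: unchanged
            *";"*) code=${line%%;*}
                   printf '%-*s%s\n' "$max" "$code" ";${line#*;}" ;;
            *)     printf '%s\n' "$line" ;;      # no comment: unchanged
        esac
    done < "$file"
}

# Small demonstration on a throwaway file.
tmp=$(mktemp)
printf '%s\n' '; top' 'a ; x' 'abcd   ; y' '  ; z' > "$tmp"
result=$(align_comments "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

In the demonstration the first line is left alone and the three trailing comments all end up starting in column 8, which is where the rightmost one already was.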

Here is how you would arrive at the above:

If semicolons cannot appear in any context other than as the start of a comment, then all you need is:

$ cat tst.awk
{ cur = index($0,";") }
NR==FNR { max = (cur > max ? cur : max); next }
cur > 1 { $0 = sprintf("%-*s%s", max-1, substr($0,1,cur-1), substr($0,cur)) }
{ print }

which you would execute as awk -f tst.awk file file (yes, specify your input file twice).
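
To see the two passes at work, here is a small self-contained run of that simple script. The sample lines and the temporary file names are made up for illustration, and it assumes an awk whose sprintf accepts a * width (gawk and mawk do, as far as I know).

```shell
# Throwaway working directory with a made-up sample and the simple script.
tmp=$(mktemp -d)
cat > "$tmp/file" <<'EOF'
; header comment
mov eax, 1 ; first
xor ebx, ebx     ; second
    ; continuation
EOF

cat > "$tmp/tst.awk" <<'EOF'
{ cur = index($0,";") }
NR==FNR { max = (cur > max ? cur : max); next }
cur > 1 { $0 = sprintf("%-*s%s", max-1, substr($0,1,cur-1), substr($0,cur)) }
{ print }
EOF

# The file is named twice: while NR==FNR (first read) awk only records the
# rightmost comment column; on the second read it pads each line to that column.
aligned=$(awk -f "$tmp/tst.awk" "$tmp/file" "$tmp/file")
printf '%s\n' "$aligned"
rm -rf "$tmp"
```

The header line starts with ; so it is printed as is; the other three comments are moved to column 18, the column of the rightmost comment in the sample.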

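If your code can contain semicolons in contexts that are not the start of a comment, e.g. in the middle of a string, then you need to tell us how to distinguish a comment-start semicolon from semicolons in other contexts. But if they can otherwise only appear inside strings between single quotes, e.g. the ; inside 'Hello; world!' below: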

$ cat file
; Also change "-f elf " for "-f elf64" in build command.
;
section .data                    ; section for initialized data
str: db 'Hello; world!', 0Ah                   ; message string with new-line char
                               ; at the end (10 decimal)

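then you need to replace each string with a run of blank characters before looking for the first semicolon (which is then presumably the start of a comment):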

$ cat tst.awk
{
    nostrings = ""
    tail = $0
    while ( match(tail,/'[^']*'/) ) {
        nostrings = nostrings substr(tail,1,RSTART-1) sprintf("%*s",RLENGTH,"")
        tail = substr(tail,RSTART+RLENGTH)
    }
    nostrings = nostrings tail
    cur = index(nostrings,";")
}
...the rest as before...
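
As a quick check of what the masking step does, the sketch below runs just that loop on one made-up line and prints the column of the first semicolon with and without masking. The sample line and the temporary file are assumptions for illustration only.

```shell
# Write the masking loop to a throwaway awk file (quoted heredoc keeps the ' characters intact).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{
    nostrings = ""
    tail = $0
    while ( match(tail,/'[^']*'/) ) {
        nostrings = nostrings substr(tail,1,RSTART-1) sprintf("%*s",RLENGTH,"")
        tail = substr(tail,RSTART+RLENGTH)
    }
    nostrings = nostrings tail
    # Column of the first ";" in the raw line versus in the masked copy
    printf "naive=%d masked=%d\n", index($0,";"), index(nostrings,";")
}
EOF

cols=$(printf '%s\n' "db 'a;b', 0Ah ; note" | awk -f "$tmp")
printf '%s\n' "$cols"    # prints: naive=6 masked=15
rm -f "$tmp"
```

Because each string is replaced by the same number of blanks, the masked copy keeps the original line length, so the column found in it can be used directly on $0.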

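Finally, if you don't want to specify the file name twice on the command line, just duplicate its name in the ARGV[] array by adding this line at the top: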

BEGIN{ ARGV[ARGC] = ARGV[ARGC-1]; ARGC++ }
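
A small way to convince yourself that this BEGIN line makes awk read the file twice (the two-line sample file here is made up for the demonstration):

```shell
tmp=$(mktemp)
printf 'one\ntwo\n' > "$tmp"

# Only one file name is given; the BEGIN block appends it to ARGV a second time.
# NR==FNR is true only while the first copy is being read.
passes=$(awk 'BEGIN{ ARGV[ARGC] = ARGV[ARGC-1]; ARGC++ }
              { print (NR==FNR ? "pass1" : "pass2"), FNR, $0 }' "$tmp")
printf '%s\n' "$passes"
rm -f "$tmp"
```

Each input line shows up once tagged pass1 and once tagged pass2, which is exactly what the NR==FNR test in the script relies on.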

Regarding "linux - How do I format lines in a file using shell?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27210279/
