bash - Unix: parsing varying named values into separate rows

We are given an input file like the one below, in which the Text column varies in length from row to row.

Input file:

ID|Text
1|name1=value1;name3;name4=value2;name5=value5
2|name1=value1;name2=value2;name6=;name7=value7;name8=value8

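The Text column holds name=value pairs separated by semicolons, and the number of pairs varies. Note that a name in the Text column can itself contain a semicolon (for example name3;name4). We are trying to parse the input but have not been able to handle it with AWK or BASH.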

Desired output:

1|name1=value1
1|name3;name4=value2
1|name5=value5
2|name1=value1
2|name2=value2
2|name6=
2|name7=value7
2|name8=value8

The snippet below works for ID=2 but not for ID=1, where the name name3;name4 is wrongly split across two lines.

echo "2|name1=value1;name2=value2;name6=;name7=value7;name8=value8" | while IFS="|"; read id text;do dsc=`echo $text|tr ';' '\n'`;echo "$dsc" >tmp;sed -i "s/^/${id}\|/g" tmp;done
cat tmp
2|name1=value1
2|name2=value2
2|name6=
2|name7=value7
2|name8=value8

echo "1|name1=value1;name3;name4=value2;name5=value5" | while IFS="|"; read id text;do dsc=`echo $text|tr ';' '\n'`;echo "$dsc" >tmp;sed -i "s/^/${id}\|/g" tmp;done
cat tmp
1|name1=value1
1|name3
1|name4=value2
1|name5=value5

Any help is greatly appreciated.

Best answer

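Could you please try the following, written and tested with the shown samples in a recent version of GNU awk. Because the OP's awk version is old, anyone on an old version of awk should invoke it as awk --re-interval instead, which enables the {1,} interval expression used in the regex.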

awk '
BEGIN{
  FS=OFS="|"
}
FNR==1{ next }
{
  first=$1
  while(match($0,/(name[0-9]+;?){1,}=(value[0-9]+)?/)){
    print first,substr($0,RSTART,RLENGTH)
    $0=substr($0,RSTART+RLENGTH)
  }
}'  Input_file

The output is as follows.

1|name1=value1
1|name3;name4=value2
1|name5=value5
2|name1=value1
2|name2=value2
2|name6=
2|name7=value7
2|name8=value8
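
For awk versions where interval expressions are a concern, the same output can also be produced without a regex. The following is only a sketch under two assumptions that the question does not confirm: a semicolon belongs to a name only when the token before it contains no "=", and values never contain a semicolon. The function name split_pairs is hypothetical and is not part of the original answer.

```shell
#!/bin/sh
# split_pairs: hypothetical helper, not from the original answer.
# Reads ID|Text rows from the given files (or stdin) and prints one ID|pair per line.
split_pairs() {
  awk -F'|' '
  NR == 1 { next }                        # skip the ID|Text header line
  {
    n = split($2, tok, ";")               # break the Text column on semicolons
    pending = ""                          # name fragments seen so far without "="
    for (i = 1; i <= n; i++) {
      piece = (pending == "" ? tok[i] : pending ";" tok[i])
      if (index(tok[i], "=")) {           # token contains "=", so the pair is complete
        print $1 "|" piece
        pending = ""
      } else {
        pending = piece                   # assumed to be part of a name; keep collecting
      }
    }
    if (pending != "") print $1 "|" pending   # trailing fragment that never got "="
  }' "$@"
}

# Usage with the sample rows from the question
split_pairs <<'EOF'
ID|Text
1|name1=value1;name3;name4=value2;name5=value5
2|name1=value1;name2=value2;name6=;name7=value7;name8=value8
EOF
```

Unlike the regex version, this sketch does not require names to look like name followed by digits, but it will misjoin tokens if a value can legitimately contain a semicolon.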

Explanation: a detailed, annotated version of the script above (the comments are for explanation purposes only).

awk '                                        ##Start the awk program.
BEGIN{                                       ##Start the BEGIN section.
  FS=OFS="|"                                 ##Set both FS and OFS to |.
}
FNR==1{ next }                               ##Skip the first line (the header) without printing anything.
{
  first=$1                                   ##Save the first field (the ID) in the variable first.
  while(match($0,/(name[0-9]+;?){1,}=(value[0-9]+)?/)){   ##Loop while the regex finds one or more names (each optionally followed by ;), then =, then an optional value.
    print first,substr($0,RSTART,RLENGTH)    ##Print the ID and the currently matched substring.
    $0=substr($0,RSTART+RLENGTH)             ##Keep only the part of the line after the match as the current line.
  }
}' Input_file                                ##Name of the input file.
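
To see what match() reports on each pass of that loop, the sketch below prints RSTART and RLENGTH for every match on the Text part of the first sample row. It is an illustration rather than part of the original answer: the function name trace_matches and the variable rest are made up here, and the regex uses + in place of {1,}. The two quantifiers mean the same thing in an extended regular expression, so this form does not depend on interval-expression support.

```shell
#!/bin/sh
# trace_matches: hypothetical helper that shows how the match()/substr() loop advances.
trace_matches() {
  awk '
  {
    rest = $0                                   # work on a copy instead of reassigning $0
    while (match(rest, /(name[0-9]+;?)+=(value[0-9]+)?/)) {
      # RSTART is the 1-based start of the match, RLENGTH is its length
      printf "RSTART=%d RLENGTH=%d match=%s\n", RSTART, RLENGTH, substr(rest, RSTART, RLENGTH)
      rest = substr(rest, RSTART + RLENGTH)     # drop everything up to the end of the match
    }
  }'
}

printf '%s\n' 'name1=value1;name3;name4=value2;name5=value5' | trace_matches
```

For this input the sketch prints:

RSTART=1 RLENGTH=12 match=name1=value1
RSTART=2 RLENGTH=18 match=name3;name4=value2
RSTART=2 RLENGTH=12 match=name5=value5

RSTART is 2 on the second and third passes because the remaining text then begins with the semicolon left over from the previous pair, which the regex skips.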

For more on bash - Unix: parsing varying named values into separate rows, see the similar question on Stack Overflow: https://stackoverflow.com/questions/64697767/
