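string - Replace each nth occurrence of 'foo' and 'bar' on two distinct columns with the numerically respective nth line of a supplied file, in the respective columns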

I have a file, source.txt, that contains two columns of data, as shown below. Each column in source.txt is enclosed in [ ] (square brackets):

[hot] [water]
[16] [boots and, juice]

I also have a file, target.txt, which contains empty lines and has a period at the end of each line:

the weather is today (foo) but we still have (bar). 

= (

the next bus leaves at (foo) pm, we can't forget to take the (bar).

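I want to replace foo on the nth line of target.txt with the "respective content" of the first column of source.txt, and replace bar on the nth line of target.txt with the "respective content" of the second column of source.txt.

I searched other sources to work out how to do this. I already had a command that I use to "replace each nth occurrence of 'foo' by numerically respective nth line of a supplied file", but I could not adapt it: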

awk 'NR==FNR {a[NR]=$0; next} /foo/{gsub("foo", a[++i])} 1' source.txt target.txt > output.txt;

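I remember seeing a way to use gsub with two columns of data, but I do not remember exactly what the difference was.

EDIT POST: Sometimes the text in target.txt contains symbols such as =, ( and ) between the lines to be processed. I added these symbols to the sample because some answers will not work if they are present in target.txt.

Note: the number of lines in target.txt, and therefore the number of occurrences of bar and foo in that file, can vary; I am only showing a sample. However, foo and bar each occur exactly once per line.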

Best answer

With your shown samples, please try the following answer. Written and tested in GNU awk.

awk -F'\\[|\\] \\[|\\]' '
FNR==NR{
  foo[FNR]=$2
  bar[FNR]=$3
  next
}
NF{
  gsub(/\<foo\>/,foo[++count])
  gsub(/\<bar\>/,bar[count])
}
1
' source.txt FS=" " target.txt

Explanation: Adding a detailed explanation for the above.

awk -F'\\[|\\] \\[|\\]' '       ##Setting field separator as [ OR ] [ OR ] here.
FNR==NR{                        ##Checking condition FNR==NR which will be TRUE when source.txt will be read.
  foo[FNR]=$2                   ##Creating foo array with index of FNR and value of 2nd field here.   
  bar[FNR]=$3                   ##Creating bar array with index of FNR and value of 3rd field here.
  next                          ##next will skip all further statements from here.
}
NF{                             ##If line is NOT empty then do following.
  gsub(/\<foo\>/,foo[++count])  ##Globally substituting foo with array foo value, whose index is count.
  gsub(/\<bar\>/,bar[count])    ##Globally substituting bar with array of bar with index of count.
}
1                               ##printing line here.
' source.txt FS=" " target.txt  ##Input file names; FS=" " restores default field splitting before target.txt is read.
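
A hedged aside on the version above: because its NF block runs on every non-blank line, a line such as = ( still advances count, which shifts the lookups for the lines that follow. One possible way around this, assuming every line to be processed contains foo and/or bar, is to advance the counter only on lines that mention a placeholder. The sketch below is an illustration rather than part of the original answer: the scratch directory, the array names col1 and col2, the index()-based column parsing, and the plain foo/bar patterns (used instead of the GNU \< \> word boundaries so that it also runs on other awk implementations) are all assumptions.

```shell
#!/bin/sh
# Sketch: advance the counter only on lines that contain foo or bar.
# The files below recreate the samples from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > source.txt <<'EOF'
[hot] [water]
[16] [boots and, juice]
EOF

cat > target.txt <<'EOF'
the weather is today (foo) but we still have (bar).

= (

the next bus leaves at (foo) pm, we can't forget to take the (bar).
EOF

awk '
FNR == NR {
  s = substr($0, 2, length($0) - 2)   # drop the outer [ and ]
  p = index(s, "] [")                 # position of the column boundary
  col1[FNR] = substr(s, 1, p - 1)     # first column of this source line
  col2[FNR] = substr(s, p + 3)        # second column of this source line
  next
}
/foo/ || /bar/ {                      # blank and symbol-only lines are skipped
  ++count
  gsub(/foo/, col1[count])
  gsub(/bar/, col2[count])
}
1
' source.txt target.txt > output.txt

cat output.txt
```

Like the gsub() calls in the answer, this still treats & in the replacement text specially, so it assumes source.txt contains no & characters.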

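EDIT: Also adding the following solution, which handles n occurrences of [...] in the source file and matches them in the target file as well. Since this is the solution that worked for the OP (confirmed in the comments), adding it here. Fair warning: this will fail when source.txt contains &.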

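awk '
FNR==NR{                                        ##TRUE while source.txt is being read.
  while(match($0,/\[[^]]*\]/)){                 ##Looping over every [...] entry on the line.
    arr[++count]=substr($0,RSTART+1,RLENGTH-2)  ##Saving the text between the brackets in arr.
    $0=substr($0,RSTART+RLENGTH)                ##Keeping only the rest of the line after the entry.
  }
  next                                          ##Skipping all further statements for source.txt.
}
{
  line=$0                                       ##Working on a copy of the current target.txt line.
  while(match(line,/\(?[[:space:]]*(\<foo\>|\<bar\>)[[:space:]]*\)?/)){  ##Finding next foo or bar, with optional parentheses.
    val=substr(line,RSTART,RLENGTH)             ##Matched text, which sub() below treats as a regexp.
    sub(val,arr[++count1])                      ##Replacing its first match in $0 with the next arr entry.
    line=substr(line,RSTART+RLENGTH)            ##Continuing the search after the match.
  }
}
1                                               ##printing line here.
' source.txt target.txt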
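
The & caveat comes from sub() itself: inside the replacement string, & stands for the matched text. One possible workaround, sketched here under assumptions rather than taken from the original answer, is to rebuild each line by string concatenation so that the source text is inserted literally. The scratch directory, the variable names, the index()-based bracket parsing, and the source line containing & are made up for this illustration. Like the solution above, the sketch consumes the source entries strictly in reading order, so it assumes foo comes before bar on each line.

```shell
#!/bin/sh
# Sketch: insert source entries literally, without sub(), so "&" is harmless.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical source line containing "&" (not part of the data in the question).
cat > source.txt <<'EOF'
[hot & humid] [water]
[16] [boots and, juice]
EOF

cat > target.txt <<'EOF'
the weather is today (foo) but we still have (bar).

= (

the next bus leaves at (foo) pm, we can't forget to take the (bar).
EOF

awk '
FNR == NR {
  rest = $0
  # Collect every [...] entry in reading order using plain string functions.
  while ((lb = index(rest, "[")) > 0) {
    rest = substr(rest, lb + 1)
    rb = index(rest, "]")
    if (rb == 0) break
    arr[++n] = substr(rest, 1, rb - 1)
    rest = substr(rest, rb + 1)
  }
  next
}
{
  rest = $0
  out = ""
  # Copy the text before each placeholder, then append the next entry as-is.
  while (match(rest, /foo|bar/)) {
    out = out substr(rest, 1, RSTART - 1) arr[++k]
    rest = substr(rest, RSTART + RLENGTH)
  }
  print out rest
}
' source.txt target.txt > output.txt

cat output.txt
```

Because the placeholder is located with match() and the line is reassembled with substr(), neither the matched text nor the replacement is ever interpreted as a regexp or as a sub() replacement string.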

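Regarding "string - Replace each nth occurrence of 'foo' and 'bar' on two distinct columns with the numerically respective nth line of a supplied file, in the respective columns", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68702193/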
