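I'm trying to convert some (multi-line) git history, specifically the file renames and copies, into a CSV file. Here are my regex and a sample file. The regex works fine on that site.
Regex: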
commit (.+)\n(?:.*\n)+?similarity index (\d+)+%\n(rename|copy) from (.+)\n\3 to (.+)\n
Sample input:
commit 2701af4b3b66340644b01835a03bcc760e1606f8
Author: ostrovsky.alex <ostrovsky.alex@a51b5712-02d0-11de-9992-cbdf800730d7>
Date: Sat Oct 16 20:44:32 2010 +0000
* Moved old sources to Maven src/main/java
diff --git a/alexo-chess/src/ao/chess/v2/move/Pawns.java b/alexo-chess/src/main/java/ao/chess/v2/move/Pawns.java
similarity index 100%
rename from alexo-chess/src/ao/chess/v2/move/Pawns.java
rename to alexo-chess/src/main/java/ao/chess/v2/move/Pawns.java
commit ea53898dcc969286078700f42ca5be36789e7ea7
Author: ostrovsky.alex <ostrovsky.alex@a51b5712-02d0-11de-9992-cbdf800730d7>
Date: Sat Oct 17 03:30:43 2009 +0000
synch
diff --git a/src/chess/v2/move/Pawns.java b/alexo-chess/src/ao/chess/v2/move/Pawns.java
similarity index 100%
copy from src/chess/v2/move/Pawns.java
copy to alexo-chess/src/ao/chess/v2/move/Pawns.java
commit b869f395429a2c1345ce100953bfc6038d9835f5
Author: ostrovsky.alex <ostrovsky.alex@a51b5712-02d0-11de-9992-cbdf800730d7>
Date: Wed Oct 7 22:43:06 2009 +0000
MctsPlayer works
diff --git a/ao/chess/v2/move/Pawns.java b/src/chess/v2/move/Pawns.java
similarity index 100%
copy from ao/chess/v2/move/Pawns.java
copy to src/chess/v2/move/Pawns.java
commit 4c697c510f5154d20be7500be1cbdecbaf99495c
Author: ostrovsky.alex <ostrovsky.alex@a51b5712-02d0-11de-9992-cbdf800730d7>
Date: Wed Sep 23 15:06:17 2009 +0000
* synch
diff --git a/v2/move/Pawns.java b/ao/chess/v2/move/Pawns.java
similarity index 95%
rename from v2/move/Pawns.java
rename to ao/chess/v2/move/Pawns.java
index e0172a3..e3659c5 100644
--- a/v2/move/Pawns.java
+++ b/ao/chess/v2/move/Pawns.java
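However, when I run the following perl command (in git bash on Windows 10), I get only one matching line, as opposed to the 4 matches that the regex-testing site shows for the sample.
I know it's probably something silly, like needing a loop, but I'm confused about slurping with -0777 and applying a pattern multiple times. I tried the -p option, but it prints the entire input, and I only want to see the output of print (i.e. the CSV lines). I also thought /g would make the pattern apply multiple times to the input file, but since -0777 turns the whole file into a single record, I'm no longer sure.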
<Pawns.java.history.txt perl -0777 -ne 'if (/commit (.+)\n(?:.*\n)+?similarity index (\d+)+%\n(rename|copy) from (.+)\n\3 to (.+)\n/g) { print $1.",".$2.",".$3.",".$4.",".$5."\n" }'
The output is only one line, whereas the sample file should produce 4 lines in total:
2701af4b3b66340644b01835a03bcc760e1606f8,100,rename,alexo-chess/src/ao/chess/v2/move/Pawns.java,alexo-chess/src/main/java/ao/chess/v2/move/Pawns.java
Expected output:
2701af4b3b66340644b01835a03bcc760e1606f8,100,rename,alexo-chess/src/ao/chess/v2/move/Pawns.java,alexo-chess/src/main/java/ao/chess/v2/move/Pawns.java
ea53898dcc969286078700f42ca5be36789e7ea7,100,copy,src/chess/v2/move/Pawns.java,alexo-chess/src/ao/chess/v2/move/Pawns.java
b869f395429a2c1345ce100953bfc6038d9835f5,100,copy,ao/chess/v2/move/Pawns.java,src/chess/v2/move/Pawns.java
4c697c510f5154d20be7500be1cbdecbaf99495c,95,rename,v2/move/Pawns.java,ao/chess/v2/move/Pawns.java
Best answer
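You just need to replace your if with while. In scalar context, a /g match finds one match per evaluation and remembers where it stopped, so if tests the pattern once and prints one line, while while keeps matching from the previous position until nothing is left: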
perl -0777 -ne 'while (/commit (.+)\n(?:.*\n)+?similarity index (\d+)+%\n(rename|copy) from (.+)\n\3 to (.+)\n/g) { print $1.",".$2.",".$3.",".$4.",".$5."\n" }' file
2701af4b3b66340644b01835a03bcc760e1606f8,100,rename,alexo-chess/src/ao/chess/v2/move/Pawns.java,alexo-chess/src/main/java/ao/chess/v2/move/Pawns.java
ea53898dcc969286078700f42ca5be36789e7ea7,100,copy,src/chess/v2/move/Pawns.java,alexo-chess/src/ao/chess/v2/move/Pawns.java
b869f395429a2c1345ce100953bfc6038d9835f5,100,copy,ao/chess/v2/move/Pawns.java,src/chess/v2/move/Pawns.java
4c697c510f5154d20be7500be1cbdecbaf99495c,95,rename,v2/move/Pawns.java,ao/chess/v2/move/Pawns.java
This question, "regex - Multi-line regex should match multiple times in a file (single-line command if possible)", is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/53764609/