csv - How to add a prefix to every row of a specific column in a CSV file from the command line on Linux

Tags: csv awk text-editor vi

I am trying to achieve the following.

The file before editing:

column-1,  column-2,  column-3,  column-4,  column-5
Row-1-c1,  Row-1-c2,  Row-1-c3,  Row-1-c4,  Row-1-c5
Row-2-c1,  Row-2-c2,  Row-2-c3,  Row-2-c4,  Row-2-c5
Row-3-c1,  Row-3-c2,  Row-3-c3,  Row-3-c4,  Row-3-c5
Row-4-c1,  Row-4-c2,  Row-4-c3,  Row-4-c4,  Row-4-c5
Row-5-c1,  Row-5-c2,  Row-5-c3,  Row-5-c4,  Row-5-c5

The file after editing:

column-1,   column-2,   column-3,           column-4,   column-5
Row-1-c1,   Row-1-c2,   Prefix-Row-1-c3,    Row-1-c4,   Row-1-c5
Row-2-c1,   Row-2-c2,   Prefix-Row-2-c3,    Row-2-c4,   Row-2-c5
Row-3-c1,   Row-3-c2,   Prefix-Row-3-c3,    Row-3-c4,   Row-3-c5
Row-4-c1,   Row-4-c2,   Prefix-Row-4-c3,    Row-4-c4,   Row-4-c5
Row-5-c1,   Row-5-c2,   Prefix-Row-5-c3,    Row-5-c4,   Row-5-c5

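Note that column-3 is the column whose value should get the prefix on every row except the header. I would like to know which editor is best suited for this, and which commands produce the desired result.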

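Best answer

Perhaps a better question would be "how many different tools could you use to do this job?"

I would probably choose awk as the simplest tool; it does the job fairly easily: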

awk -F, 'NR == 1 { print; OFS="," } NR > 1 { sub(/^ +/, "&Prefix-", $3); print }'

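The sub operation adds Prefix- after the spaces at the start of column 3. The code does not attempt to adjust the content of line 1 (the header). If you want to add spaces after $3 in the header, I think the following does the job (because of where the commas are placed, the extra spaces go in front of column 4 on line 1):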

awk -F, 'NR == 1 { OFS=","; $4 = "       " $4; print }
         NR  > 1 { sub(/^ +/, "&Prefix-", $3); print }'
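
If the column number and the prefix should not be hard-coded, the same idea can be sketched with awk variables. This is only an illustration: the function name add_prefix and the variables col and prefix are hypothetical names, not part of the answer above, and the header line is simply left unpadded.

```shell
#!/bin/sh
# Hypothetical helper (add_prefix, col and prefix are illustrative names).
# Usage: add_prefix COLUMN PREFIX [FILE...]
add_prefix() {
    col=$1
    prefix=$2
    shift 2
    awk -F, -v col="$col" -v prefix="$prefix" '
        BEGIN   { OFS = "," }
        NR == 1 { print; next }                  # header line is printed unchanged
        { sub(/^ */, "&" prefix, $col); print }  # prefix goes after any leading spaces
    ' "$@"
}

# Example: prefix column 3 of two sample lines read from standard input.
add_prefix 3 Prefix- <<'EOF'
column-1,  column-2,  column-3
Row-1-c1,  Row-1-c2,  Row-1-c3
EOF
```

The sketch uses /^ */ rather than /^ +/ so that a column with no leading spaces (such as column 1) is prefixed too. A prefix containing & or a backslash would be treated specially by awk, so this assumes a plain-text prefix.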

Do you know how to do the same thing with sed?

Yes, like this:

sed -e '  1s/^\(\([^,]*,[[:space:]]*\)\{3\}\)/\1       /' \
    -e '2,$s/^\(\([^,]*,[[:space:]]*\)\{2\}\)/\1Prefix-/' "$@"

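The first expression handles the first line; it inserts as many spaces as there are characters in the prefix (here "Prefix-", so 7 spaces) after the third column. The second expression handles the remaining lines; it adds the prefix in front of the third column.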

To handle column N instead of column 3, change the 3 in \{3\} to N and the 2 in \{2\} to N-1.
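
As a rough sketch of that substitution rule, the two sed expressions can be built from shell variables. The function name prefix_col_sed and the variables n, prefix and pad are hypothetical names introduced here for illustration. The sketch assumes the prefix contains no /, & or backslash, since those are special in a sed replacement.

```shell
#!/bin/sh
# Hypothetical helper (prefix_col_sed, n, prefix and pad are illustrative names).
# Usage: prefix_col_sed COLUMN PREFIX [FILE...]
prefix_col_sed() {
    n=$1
    prefix=$2
    shift 2
    # One space per character of the prefix, used to pad the header line.
    pad=$(printf '%s' "$prefix" | sed 's/./ /g')
    # Line 1: skip N "field, spaces" groups, then insert the padding.
    # Lines 2..$: skip N-1 groups, then insert the prefix.
    sed -e "1s/^\(\([^,]*,[[:space:]]*\)\{$n\}\)/\1$pad/" \
        -e "2,\$s/^\(\([^,]*,[[:space:]]*\)\{$((n - 1))\}\)/\1$prefix/" "$@"
}

# Example: prefix column 3 of two sample lines read from standard input.
prefix_col_sed 3 Prefix- <<'EOF'
column-1,  column-2,  column-3,  column-4
Row-1-c1,  Row-1-c2,  Row-1-c3,  Row-1-c4
EOF
```

When N is the last column, the header expression finds no Nth comma and leaves line 1 unchanged, which is harmless because there is no following column to realign.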

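I re-checked the second Awk script; it produces the correct output for me on the sample data from the question. So does the first Awk script, within its limitations (it leaves the header line unaligned). Make sure you are using something other than the C shell (which gets confused by multi-line quoted strings), and take care when copying.

Sample output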
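
One way to repeat that check is to run the sed command and the second Awk script on the same input and compare the results byte for byte. The following is a minimal sketch, assuming mktemp -d is available (it is not required by POSIX); the temporary file names and the result variable are illustrative only.

```shell
#!/bin/sh
# Sketch: confirm that the sed and awk variants agree on sample data.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

cat > "$tmp/data" <<'EOF'
column-1,  column-2,  column-3,  column-4,  column-5
Row-1-c1,  Row-1-c2,  Row-1-c3,  Row-1-c4,  Row-1-c5
Row-2-c1,  Row-2-c2,  Row-2-c3,  Row-2-c4,  Row-2-c5
EOF

# The sed command from the answer.
sed -e '1s/^\(\([^,]*,[[:space:]]*\)\{3\}\)/\1       /' \
    -e '2,$s/^\(\([^,]*,[[:space:]]*\)\{2\}\)/\1Prefix-/' "$tmp/data" > "$tmp/sed.out"

# The second Awk script from the answer.
awk -F, 'NR == 1 { OFS=","; $4 = "       " $4; print }
         NR  > 1 { sub(/^ +/, "&Prefix-", $3); print }' "$tmp/data" > "$tmp/awk.out"

# cmp -s is silent and reports only through its exit status.
if cmp -s "$tmp/sed.out" "$tmp/awk.out"; then
    result=identical
else
    result=different
fi
echo "sed and awk outputs are $result"
```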

$ cat data
column-1,  column-2,  column-3,  column-4,  column-5
Row-1-c1,  Row-1-c2,  Row-1-c3,  Row-1-c4,  Row-1-c5
Row-2-c1,  Row-2-c2,  Row-2-c3,  Row-2-c4,  Row-2-c5
Row-3-c1,  Row-3-c2,  Row-3-c3,  Row-3-c4,  Row-3-c5
Row-4-c1,  Row-4-c2,  Row-4-c3,  Row-4-c4,  Row-4-c5
Row-5-c1,  Row-5-c2,  Row-5-c3,  Row-5-c4,  Row-5-c5
$ bash manglesed.sh data
column-1,  column-2,  column-3,         column-4,  column-5
Row-1-c1,  Row-1-c2,  Prefix-Row-1-c3,  Row-1-c4,  Row-1-c5
Row-2-c1,  Row-2-c2,  Prefix-Row-2-c3,  Row-2-c4,  Row-2-c5
Row-3-c1,  Row-3-c2,  Prefix-Row-3-c3,  Row-3-c4,  Row-3-c5
Row-4-c1,  Row-4-c2,  Prefix-Row-4-c3,  Row-4-c4,  Row-4-c5
Row-5-c1,  Row-5-c2,  Prefix-Row-5-c3,  Row-5-c4,  Row-5-c5
$ bash mangleawk.sh data
column-1,  column-2,  column-3,         column-4,  column-5
Row-1-c1,  Row-1-c2,  Prefix-Row-1-c3,  Row-1-c4,  Row-1-c5
Row-2-c1,  Row-2-c2,  Prefix-Row-2-c3,  Row-2-c4,  Row-2-c5
Row-3-c1,  Row-3-c2,  Prefix-Row-3-c3,  Row-3-c4,  Row-3-c5
Row-4-c1,  Row-4-c2,  Prefix-Row-4-c3,  Row-4-c4,  Row-4-c5
Row-5-c1,  Row-5-c2,  Prefix-Row-5-c3,  Row-5-c4,  Row-5-c5
$ cat manglesed.sh
sed -e '  1s/^\(\([^,]*,[[:space:]]*\)\{3\}\)/\1       /' \
    -e '2,$s/^\(\([^,]*,[[:space:]]*\)\{2\}\)/\1Prefix-/' "$@"
$ cat mangleawk.sh
awk -F, 'NR == 1 { OFS=","; $4 = "       " $4; print }
         NR  > 1 { sub(/^ +/, "&Prefix-", $3); print }' "$@"
$ awk -F, 'NR == 1 { print; OFS="," } NR > 1 { sub(/^ +/, "&Prefix-", $3); print }' data
column-1,  column-2,  column-3,  column-4,  column-5
Row-1-c1,  Row-1-c2,  Prefix-Row-1-c3,  Row-1-c4,  Row-1-c5
Row-2-c1,  Row-2-c2,  Prefix-Row-2-c3,  Row-2-c4,  Row-2-c5
Row-3-c1,  Row-3-c2,  Prefix-Row-3-c3,  Row-3-c4,  Row-3-c5
Row-4-c1,  Row-4-c2,  Prefix-Row-4-c3,  Row-4-c4,  Row-4-c5
Row-5-c1,  Row-5-c2,  Prefix-Row-5-c3,  Row-5-c4,  Row-5-c5
$

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/24226003/
