I'm using sed to replace 14 different abbreviations (e.g. CA_23456, CB_scaffold34532, ...) with "proper" names in a file, and I can do it with all the substitutions on one line:
acc=$1
sed -e 's/CA_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_arizonica/;s/CB_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_bakeri/;s/CM_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_macrocarpa/;s/CS_[A-Z]*[a-z]*[0-9]*/Cupressus_sempervirens/;s/CT_[A-Z]*[a-z]*[0-9]*/Cupressus_torulosa/;s/JD_[A-Z]*[a-z]*[0-9]*/Juniperus_drupacea/;s/JF_[A-Z]*[a-z]*[0-9]*/Juniperus_flaccida/;s/JI_[A-Z]*[a-z]*[0-9]*/Juniperus_indica/;s/JP_[A-Z]*[a-z]*[0-9]*/Juniperus_phoenicea/;s/JX_[A-Z]*[a-z]*[0-9]*/Juniperus_procera/;s/JS_[A-Z]*[a-z]*[0-9]*/Juniperus_scopulorum/;s/MD_[A-Z]*[a-z]*[0-9]*/Microbiota_decussata/;s/XN_[A-Z]*[a-z]*[0-9]*/Xanthocyparis_nootkatensis/;s/XV_[A-Z]*[a-z]*[0-9]*/Xanthocyparis_vietnamensis/' ${acc}.nex > ${acc}_replaced.nex
To make it more readable, I'd like to split the command across multiple lines using "\" (for brevity, not all substitutions are shown):
acc=$1
sed -e 's/CA_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_arizonica/;\
s/CB_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_bakeri/;\
s/CM_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_macrocarpa/'\
${acc}.nex > ${acc}_replaced.nex
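However, I get an error message: sed: -e expression #1, char 168: unterminated address regex. I've looked at answers to similar questions on various web forums and tried various things (using 's/.../.../' on every line, leaving out the ';', ...) but I can't get it to work. What am I doing wrong?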
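Best answer
Remove the \ characters that are meant to escape the newlines. (They don't actually do that! Inside single quotes the shell passes each backslash through to sed unchanged, and sed treats it as bad syntax.) That said, I'd suggest putting the commands in a file and running it like this: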
sed -f script.sed input
where script.sed looks like this:
s/CA_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_arizonica/
s/CB_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_bakeri/
s/CM_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_macrocarpa/
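If you would rather keep everything in the shell script, the first suggestion (dropping the backslashes) might look like the sketch below. The file name sample.nex and its three lines are made up for illustration and stand in for the real ${acc}.nex. Inside the single quotes a plain newline already separates sed commands, so neither ';' nor '\' is needed there. The one remaining backslash sits outside the quotes, after a space, where the shell does treat it as a line continuation. Without that space the file name would be glued onto the end of the sed script.

```shell
# Hypothetical sample input, standing in for the asker's ${acc}.nex
acc=sample
printf '%s\n' 'CA_23456 ACGT' 'CB_scaffold34532 ACGA' 'CM_1 ACGG' > "${acc}.nex"

# Within the single-quoted script, each newline ends one sed command.
# The trailing backslash after the closing quote is a shell line continuation;
# the space before it keeps the script and the file name as separate arguments.
sed -e 's/CA_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_arizonica/
s/CB_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_bakeri/
s/CM_[A-Z]*[a-z]*[0-9]*/Hesperocyparis_macrocarpa/' \
  "${acc}.nex" > "${acc}_replaced.nex"

cat "${acc}_replaced.nex"
```

For larger sets of substitutions such as the full list of 14, the separate script.sed file is probably still the easier form to maintain.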
For more on multi-line sed commands not working, see the similar question on Stack Overflow: https://stackoverflow.com/questions/29817227/