perl - Handling a lone carriage return as a line ending

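So I have a program that removes extra newlines from FASTA files copied and pasted from the web. If you don't know what a FASTA file should look like: it is a greater-than sign followed by anything (usually header information), then a newline. The next line should contain your complete sequence (biological DNA or amino acids) on a single line, and then the pattern repeats.

Anyway, the problem is that I need the program to be flexible enough to handle anything: \r, \n, or \r\n. The chomp statement with underscores on both sides is the command that removes the extra line breaks in the sequence portion. How can I get chomp to get rid of all three options (\r, \n, \r\n)? Can I set $\= @linefeeds and have @linefeeds = "\r", "\n", "\r\n";?

I've read around online, and I know this topic has been discussed before, but I can't seem to get it to work.

Here is my code for doing this in a file: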

print "Please enter file name, using the full pathway, to save your cleaned fasta file to:\n";
chomp( $new_file = <STDIN> );
open( New_File, "+>$new_file" ) or die "Couldn't create file. Check permissions on location.\n";

#process the file line by line, chomping all lines that do not contain "greater than" and
#removing all white space from lines that do not contain "greater than"

my $firstline = 1;
while ( my $lines = <FASTA> ) {
    foreach ($lines) {
        if ( !/>/ ) {
            _chomp($lines);_
            $lines =~ s/ //g;
            print New_File "$lines";
        } else {
            if ( $firstline == 1 ) {
                print New_File "$lines";
                $firstline = 0;
            } else {
                print New_File "\n$lines";
                next;
            }
        }
    }
}

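Best answer

The fundamental problem is that $/ can only be set to a single string, and there is no value you can set it to that matches all of CR, LF, and CRLF line endings.

However, you're not the first person to run into this. I haven't tried it myself, but if you install PerlIO::eol, you should be able to say: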
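To make that concrete, here is a small sketch that counts how many records the readline operator returns for a given value of $/. The helper name count_records and the sample data are made up for illustration; the data imitates a file that uses only lone CRs.

```perl
use strict;
use warnings;

# Hypothetical helper: count the records readline returns from $data
# when $/ is set to $separator.
sub count_records {
    my ( $data, $separator ) = @_;
    local $/ = $separator;    # $/ holds one literal string, not a pattern
    open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
    my $count = 0;
    while ( defined( my $record = <$fh> ) ) {
        $count++;
    }
    close $fh;
    return $count;
}

# Sample data with CR-only line endings (an assumption for the demo).
my $cr_only = ">seq1\rACGT\rTTGA\r";

print count_records( $cr_only, "\n" ), "\n";    # prints 1: the whole file is one "line"
print count_records( $cr_only, "\r" ), "\n";    # prints 3
```

With $/ left at "\n", the loop in the question would see the entire CR-only file as a single line, so chomp would have nothing to remove.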

binmode FASTA, ":raw:eol(LF)";

It will automatically convert CR, LF, or CRLF line endings to LF.
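If installing a module is not an option, one possible alternative (not part of the answer above) is to read the whole file into a string and normalize the line endings with a substitution before processing. The sketch below assumes the file fits in memory; clean_fasta is a hypothetical name, and unlike the question's code it treats a line as a header only when ">" is its first character.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the raw contents of a FASTA file and returns
# cleaned text with one header line and one sequence line per record.
sub clean_fasta {
    my ($raw) = @_;
    $raw =~ s/\r\n?/\n/g;    # CRLF and lone CR both become LF

    my $out         = '';
    my $in_sequence = 0;     # true while sequence text is pending a newline
    for my $line ( split /\n/, $raw ) {
        if ( $line =~ /^>/ ) {
            $out .= "\n" if $in_sequence;    # finish the previous sequence
            $out .= "$line\n";
            $in_sequence = 0;
        }
        else {
            $line =~ s/\s+//g;               # drop spaces inside the sequence
            next if $line eq '';             # skip blank lines
            $out .= $line;
            $in_sequence = 1;
        }
    }
    $out .= "\n" if $in_sequence;
    return $out;
}

# Sample input mixing all three line endings (made up for the demo).
print clean_fasta(">a\rAC GT\r\nTT\n\n>b\nGG\r");
```

To apply this to a real file, the contents could be read in one go by undefining $/ locally, for example `my $raw = do { local $/; <FASTA> };`, and the returned string printed to the output handle.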

Regarding perl - Handling a lone carriage return as a line ending, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/4477167/
