perl - Extracting paragraphs from text using Perl

I want to extract paragraphs from a text variable whose contents were retrieved from a database.

To extract paragraphs from a filehandle, I use the following code:

local $/ = undef;
@paragraphs = <STDIN>;

What is the best way to extract paragraphs from a text variable in Perl? Is there a module on CPAN that performs this kind of task?

Best answer

You're almost there. Setting $/ to undef reads the entire text in as a single record.

What you want is local $/ = ""; which enables paragraph mode. According to perldoc perlvar (emphasis mine):

$/

The input record separator, newline by default. This influences Perl's idea of what a "line" is. Works like awk's RS variable, including treating empty lines as a terminator if set to the null string (an empty line cannot contain any spaces or tabs). You may set it to a multi-character string to match a multi-character terminator, or to undef to read through the end of file. Setting it to "\n\n" means something slightly different than setting to "" , if the file contains consecutive empty lines. Setting to "" will treat two or more consecutive empty lines as a single empty line. Setting to "\n\n" will blindly assume that the next input character belongs to the next paragraph, even if it's a newline.
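The difference between "" and "\n\n" described in the quote can be seen with a small experiment. This is a minimal sketch, not part of the original answer: the sample text and the read_records helper are hypothetical names chosen for illustration, and the records are read from an in-memory filehandle opened on a string.

```perl
use strict;
use warnings;

# Hypothetical sample text: two consecutive empty lines separate the paragraphs.
my $text = "first\n\n\nsecond\n";

# Hypothetical helper: read every record from a string using the given separator.
sub read_records {
    my ($separator, $string) = @_;
    local $/ = $separator;    # restored automatically when the sub returns
    open my $fh, '<', \$string or die "Cannot open in-memory filehandle: $!";
    my @records = <$fh>;
    close $fh;
    return @records;
}

my @paragraph_mode = read_records("",     $text);
my @double_newline = read_records("\n\n", $text);

# Show each record with its newlines made visible.
for my $set ( [ 'paragraph mode' => \@paragraph_mode ],
              [ 'double newline' => \@double_newline ] ) {
    my ($label, $records) = @$set;
    for my $record (@$records) {
        (my $visible = $record) =~ s/\n/\\n/g;
        print "$label: $visible\n";
    }
}
```

In paragraph mode the extra empty line is swallowed, whereas with "\n\n" the second record should start with the leftover newline.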

Of course, you can open a filehandle on a string instead of a file:

use strict;
use warnings;
use autodie;

my $text = <<TEXT;
This is a paragraph.

Here's another one that 
spans over multiple lines.

Last paragraph
TEXT

local $/ = "";
open my $fh, '<', \$text;

while ( <$fh> ) {

    print "New Paragraph: $_";
}

close $fh;

Output

New Paragraph: This is a paragraph.

New Paragraph: Here's another one that
spans over multiple lines.

New Paragraph: Last paragraph
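If the paragraphs are needed as a list rather than printed one by one, the same approach can be wrapped in a small sub. The following is only a sketch under assumptions: paragraphs_from_string is a hypothetical name, not a CPAN function, and the sample text stands in for whatever the database returns. It relies on the documented behavior that chomp in paragraph mode removes all trailing newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into paragraphs using paragraph mode.
sub paragraphs_from_string {
    my ($text) = @_;
    local $/ = "";    # paragraph mode, limited to this sub
    open my $fh, '<', \$text or die "Cannot open in-memory filehandle: $!";
    # In paragraph mode chomp strips every trailing newline from each record.
    chomp( my @paragraphs = <$fh> );
    close $fh;
    return @paragraphs;
}

# Stand-in for text fetched from a database (assumed sample data).
my $text = "This is a paragraph.\n\n"
         . "Here's another one that\nspans over multiple lines.\n\n\n"
         . "Last paragraph\n";

my @paragraphs = paragraphs_from_string($text);
printf "%d: %s\n", $_ + 1, $paragraphs[$_] for 0 .. $#paragraphs;
```

Keeping local $/ inside the sub means the rest of the program continues to read input line by line.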

This is based on a similar question found on Stack Overflow about extracting paragraphs from text using Perl: https://stackoverflow.com/questions/12561822/
