html - Perl does not print special characters in the command prompt

Tags: html perl encoding command-line special-characters

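Hello, I want to print the results both to the command prompt and to an HTML file. I use encoding(cp1252) when writing the HTML, but in the command prompt I do not see the special characters; I get garbage values instead. For example, "£" is printed as "ú". Thanks in advance.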

use strict;
use warnings;
use LWP::Simple;
use HTML::TreeBuilder::XPath;
use LWP::UserAgent;

my $competitor_declare='7shop';
my $xpath_declare='//strong';
my @urls = ("http://www.7dayshop.com/delivery-and-returns"); 


open HTML1, '>:encoding(cp1252)',"C:/Users/jeyakuma/Desktop/$competitor_declare.html";
open HTML, '>:encoding(cp1252)',"C:/Users/jeyakuma/Desktop/shipping project/database/$competitor_declare.html";  


foreach my $url (@urls) {
    print "\n\nworking on $url\n\n";
    my $ua = LWP::UserAgent->new( agent => "Mozilla/5.0" );
    my $req = HTTP::Request->new( GET => "$url" );
    my $res = $ua->request($req);

    if ( $res->is_success ) {
        print "Please wait while we create file \n\n";
        my $xp = HTML::TreeBuilder::XPath->new_from_url($url);
        my $node = $xp->findnodes_as_string("$xpath_declare") or print "couldn't find the node\n"; #give xpath
        print HTML1 $node and print "Dump file is created please configure the same in xpathconfiguration.pl\n" and print HTML $node;
        print $node;
    }
    else {
        print "file creation failed\n";
    }
}

Expected output in the command prompt

cost is - £1.99

Current result

cost is - ú1.99

Best answer

7dayshop uses the utf8 charset:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
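For context, the "ú" in the question is consistent with a single byte 0xA3 being sent to a console that uses an OEM code page. The sketch below assumes the console is on code page 437 (an assumption; the question does not say which code page is active, and 850 maps this byte the same way). The variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $pound = "\x{A3}";                  # U+00A3 POUND SIGN
my $bytes = encode('cp1252', $pound);  # cp1252 stores it as the single byte 0xA3
my $seen  = decode('cp437', $bytes);   # how a code page 437 console interprets that byte

# U+00FA is "ú", the character reported in the question
printf "byte 0x%02X is shown as U+%04X\n", ord($bytes), ord($seen);
```

This prints `byte 0xA3 is shown as U+00FA`, which suggests the text itself is intact and only the console's interpretation of the byte differs.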

To display utf8 on a Windows machine, you need to do two and a half things in the console:

  1. Change the encoding of your STDOUT with the following:

    binmode STDOUT, ':utf8:raw';
    
  2. Before running the script, change the console's encoding with the following command:

    chcp 65001
    
  3. You may need to change the console's font to something like Lucida Console.
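Step 1 works because each filehandle carries its own encoding layer, so the console and the cp1252 HTML files from the question can receive different bytes for the same string. The following is a minimal sketch of that idea, not the answer's code: `bytes_via_layer` is a hypothetical helper that writes into an in-memory buffer so the emitted bytes can be inspected, and it uses `:encoding(UTF-8)`, the stricter spelling of a UTF-8 layer.

```perl
use strict;
use warnings;

# Hypothetical helper: write $text through the given PerlIO layer into an
# in-memory buffer and return the emitted bytes as a hex string.
sub bytes_via_layer {
    my ($layer, $text) = @_;
    my $buffer = '';
    open my $fh, ">$layer", \$buffer
        or die "Cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $buffer;
}

my $text = "\x{A3}1.99";    # the price string from the question

my $utf8_hex   = bytes_via_layer(':encoding(UTF-8)',  $text);
my $cp1252_hex = bytes_via_layer(':encoding(cp1252)', $text);

print "UTF-8 : $utf8_hex\n";      # C2 A3 31 2E 39 39
print "cp1252: $cp1252_hex\n";    # A3 31 2E 39 39
```

A console switched to code page 65001 expects the two-byte UTF-8 form, while a file opened with `:encoding(cp1252)` keeps the one-byte form.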

The following demonstrates my output on a Windows machine:

use strict;
use warnings;
use autodie;

use LWP::Simple;
use HTML::TreeBuilder::XPath;
use LWP::UserAgent;

binmode STDOUT, ':utf8:raw';

my $competitor_declare = '7shop';
my $xpath_declare      = '//strong';
my @urls               = ("http://www.7dayshop.com/delivery-and-returns");

foreach my $url (@urls) {
    print "\n\nworking on $url\n\n";
    my $ua = LWP::UserAgent->new( agent => "Mozilla/5.0" );
    my $req = HTTP::Request->new( GET => "$url" );
    my $res = $ua->request($req);

    if ( $res->is_success ) {
        print "Please wait while we create file \n\n";
        my $xp = HTML::TreeBuilder::XPath->new_from_url($url);
        my $node = $xp->findnodes_as_string("$xpath_declare") or print "couldn't find the node\n";    #give xpath
        print $node;
    }
    else {
        print "file creation failed\n";
    }
}

Output:

working on http://www.7dayshop.com/delivery-and-returns

Please wait while we create file

Dump file is created please configure the same in xpathconfiguration.pl
<strong>JavaScript seem to be disabled in your browser.</strong>
<strong>7DAYSHOP.COM</strong>
<strong>Get weekly special offers and new product news</strong>
<strong id="cartHeader"><span class="hide" id="basket-btn">My Basket</span> <span class="number">(<span>0</span>)</span></strong>
<strong>UK Mainland, Highlands &amp; Islands,</strong>
<strong>Ireland (ROI) &amp; select European destinations</strong>
<strong>Deliveries to UK</strong>
<strong>UK Mainland Standard - £1.99</strong>
<strong>UK Mainland Standard Tracked - £2.99</strong>
<strong>UK Mainland Express Tracked - £3.99</strong>
<strong>UK Mainland DPD Express Courier - £5.99</strong>
<strong>Deliveries to Highlands and Islands and Channel Islands</strong>
<strong>Highlands and Islands Standard - £1.99</strong>
<strong>Channel Islands Standard - £1.99</strong>
<strong>Highlands and Islands Express Tracked - £3.99 (Not Channel Islands)<br /></strong>
<strong>Highlands and Islands DPD Express Courier - £14.99</strong>
<strong>Channel Islands DPD Express Courier - £14.99</strong>
<strong>Deliveries to Ireland (ROI)</strong>
<strong>Ireland Standard - £4.99</strong>
<strong>Ireland DPD Express Courier - £14.99</strong>
<strong>Deliveries to France (FR)</strong>
<strong>France Standard - £1.99</strong>
<strong>France DPD Express Courier - £8.49</strong>
<strong>Deliveries to Germany (DE)</strong>
<strong>Germany Standard - £1.99</strong>
<strong>Germany DPD Express Courier - £6.49</strong>
<strong style="color: #000080;">Shipping Restrictions</strong>
<strong>All orders outside of the shipping restrictions will only be able to use our DPD Courier shipping service.</strong>
<strong style="color: #000080;"><br />Standard Delivery<br /></strong>
<strong>Oversized</strong>
<strong>Lithium</strong>
<strong><a href="http://www.7dayshop.com/lithium-batteries" target="_blank">Click here for further information about Lithium Battery deliveries.<br /><br /></a>Adhesives<br /></strong>
<strong><br /></strong>
<strong style="color: #000080;"><br />RETURNS / MISSING ITEMS</strong>
<strong>Returns Address:</strong>
ress:</strong>

Regarding html - Perl does not print special characters in the command prompt, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25061139/
