ruby - Nokogiri throws an exception inside a function but not outside it

I'm new to Ruby and am using Nokogiri to parse HTML pages. A function throws an error when execution reaches the following line:

currentPage = Nokogiri::HTML(open(url))

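I have verified the function's input: url is a string containing a web address. The line above works exactly as expected outside a function, but not inside one. When execution reaches that line inside the function, the following error is raised: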

WebCrawler.rb:25:in `explore': undefined method `+@' for #<Nokogiri::HTML::Document:0x007f97ea0cdf30> (NoMethodError)
from WebCrawler.rb:43:in `<main>'

The function containing the problematic line is pasted below.

def explore(url)
    if CRAWLED_PAGES_COUNTER > CRAWLED_PAGES_LIMIT
            return
    end
    CRAWLED_PAGES_COUNTER++

    currentPage = Nokogiri::HTML(open(url))
    links = currentPage.xpath('//@href').map(&:value)

    eval_page(currentPage)

    links.each do|link|
            puts link
            explore(link)
    end
end

Here is the complete program (it isn't long):

require 'nokogiri'
require 'open-uri'

#Crawler Params
START_URL = "https://en.wikipedia.org"
CRAWLED_PAGES_COUNTER = 0
CRAWLED_PAGES_LIMIT = 5

#Crawler Functions
def explore(url)
    if CRAWLED_PAGES_COUNTER > CRAWLED_PAGES_LIMIT
            return
    end
    CRAWLED_PAGES_COUNTER++

    currentPage = Nokogiri::HTML(open(url))
    links = currentPage.xpath('//@href').map(&:value)

    eval_page(currentPage)

    links.each do|link|
            puts link
            explore(link)
    end
end

def eval_page(page)
    puts page.title
end

#Start Crawling


explore(START_URL)

Best answer

require 'nokogiri'
require 'open-uri'

# Crawler params
# Globals instead of constants: reassigning a constant inside a method
# is a SyntaxError ("dynamic constant assignment").
$START_URL = "https://en.wikipedia.org"
$CRAWLED_PAGES_COUNTER = 0
$CRAWLED_PAGES_LIMIT = 5

# Crawler functions
def explore(url)
    if $CRAWLED_PAGES_COUNTER > $CRAWLED_PAGES_LIMIT
            return
    end
    # Ruby has no ++ operator; increment with += 1 instead.
    $CRAWLED_PAGES_COUNTER += 1

    # Note: on Ruby 3.0 and later, open-uri needs URI.open(url) here.
    currentPage = Nokogiri::HTML(open(url))
    links = currentPage.xpath('//@href').map(&:value)

    eval_page(currentPage)

    links.each do |link|
            puts link
            explore(link)
    end
end

def eval_page(page)
    puts page.title
end

# Start crawling
explore($START_URL)
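
The answer's code comes without an explanation, so here is a small sketch of what appears to be going on. Ruby has no `++` operator: `CRAWLED_PAGES_COUNTER++` followed by the assignment on the next line is read as one expression, roughly `CRAWLED_PAGES_COUNTER + (+(currentPage = ...))`, and the unary plus (`+@`) is then called on the document, which matches the error in the question. Simply changing `++` to `+= 1` on a constant would not work either, because assigning to a constant inside a method is rejected as a dynamic constant assignment, which is presumably why the answer switches to global variables. In the sketch below, `FakeDocument`, `PAGE_COUNTER`, `$page_counter`, `broken_explore` and `fixed_explore` are hypothetical names used only for illustration; `FakeDocument` stands in for `Nokogiri::HTML::Document` so that no network access is needed.

```ruby
# Hypothetical stand-in for Nokogiri::HTML::Document (no network needed).
class FakeDocument; end

PAGE_COUNTER = 0

# Reproduces the bug: there is no ++ operator, so the parser reads the two
# statements below as PAGE_COUNTER + (+(page = FakeDocument.new)).
def broken_explore
  PAGE_COUNTER++

  page = FakeDocument.new
  page
rescue NoMethodError => e
  # Return the error instead of crashing so it can be inspected.
  e
end

$page_counter = 0

# Same idea as the answer: a global variable incremented with += 1.
def fixed_explore
  $page_counter += 1
  FakeDocument.new
end

error = broken_explore
puts error.class   # NoMethodError
puts error.name    # +@

fixed_explore
puts $page_counter # 1
```

The error only shows up inside the function because that is the only place where `++` is used; the `Nokogiri::HTML(open(url))` line itself is fine.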

A similar question about "ruby - Nokogiri throws an exception inside a function but not outside it" can be found on Stack Overflow: https://stackoverflow.com/questions/42633731/
