ruby - Iterating over HTML elements with Ruby Mechanize

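I may be way off in what I am even attempting to do, but here is what I have so far.

I want to collect fantasy football projections, put all the HTML elements (for different players) into an array, and iterate over them to display the results.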

require 'mechanize'

mechanize = Mechanize.new

dk_qb = mechanize.get('http://www.numberfire.com/nfl/fantasy/fantasy-football-projections/qb')



dk_qb_array = ['#container > div > div > div:nth-child(2) > div.fl.clearfix > h2',
            '#container > div > div > div:nth-child(3) > div.fl.clearfix > h2']

dk_qb_array.each do |name|
 require 'mechanize'
 mechanize = Mechanize.new

 dk_qb = mechanize.get('http://www.numberfire.com/nfl/fantasy/fantasy-football-projections/qb')

 puts "#{dk_qb}.at('#{name}').text.strip"

end

returns ==> #<Mechanize::Page:0x007f9ed95058f0>.at('#container > div > div > div:nth-child(2) > div.fl.clearfix > h2').text.strip

            #<Mechanize::Page:0x007f9ed91382e0>.at('#container > div > div > div:nth-child(3) > div.fl.clearfix > h2').text.strip

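I got it working one element at a time, but any advice on iterating over more elements would be greatly appreciated.

Best answer

You don't need to require Mechanize again here: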

dk_qb_array.each do |name|
  require 'mechanize'
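The output shown in the question also points to a second problem: only `dk_qb` and `name` sit inside `#{...}`, so `.at(...).text.strip` is printed as literal text instead of being called. Below is a minimal offline sketch of the difference. `FakePage` and `FakeNode` are hypothetical stand-ins for `Mechanize::Page` and the node it returns, and the selectors are shortened placeholders; with the real page you would fetch once, outside the loop, and call `dk_qb.at(name)` the same way.

```ruby
# FakeNode and FakePage are hypothetical stand-ins so this sketch runs
# without network access; they are not part of Mechanize.
FakeNode = Struct.new(:text)

class FakePage
  def initialize(nodes)
    @nodes = nodes
  end

  # Mimics the behavior of Page#at: first matching node, or nil if none.
  def at(selector)
    @nodes[selector]
  end
end

page = FakePage.new(
  'h2.first' => FakeNode.new("  Week 1 Fantasy Football QB Projections \n")
)

# Placeholder selectors; the second one deliberately matches nothing.
selectors = ['h2.first', 'h2.missing']

# Call the methods in plain Ruby (or inside #{...}); text outside the
# interpolation braces is never evaluated.
results = selectors.map do |selector|
  node = page.at(selector)
  node ? node.text.strip : '(no match)'
end

results.each { |line| puts line }
```

Guarding against `nil` matters here because `at` returns `nil` when a selector matches nothing, and calling `.text` on `nil` raises `NoMethodError`.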

In any case, you should be using Nokogiri rather than Mechanize:

$ gem install nokogiri

Then:

require 'nokogiri'
require 'open-uri'

selectors = [
  '#container > div > div > div:nth-child(2) > div.fl.clearfix > h2',
  '#container > div > div > div:nth-child(3) > div.fl.clearfix > h2',
]
url = 'http://www.numberfire.com/nfl/fantasy/fantasy-football-projections/qb'
doc = Nokogiri::HTML(URI.open(url))  # URI.open: plain open(url) no longer fetches URLs on Ruby 3.0+

selectors.each do |selector|
  puts selector
  doc.css(selector).each do |matching_tag|
    puts "\t #{matching_tag.text}"
  end
end


--output:--
#container > div > div > div:nth-child(2) > div.fl.clearfix > h2
     Week 1 Fantasy Football QB Projections
#container > div > div > div:nth-child(3) > div.fl.clearfix > h2

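As you can see from the output, the second selector has no matches, and the single match for your first selector is probably not what you want. There is only one <h2> on the whole page, so looking for a second one won't work.

A better approach is to use an id attribute to go straight to the area you want, e.g.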

"tbody#projection-data > tr"

Then do something like this:

doc.css("tbody#projection-data > tr").each do |tr|
  #The <tr> contains the data for one player

  tr.css('td').each do |td|  #Now step through the <td>'s for the given <tr>/player
    puts td.text.strip
  end

  puts '-' * 10  #Marks the end of the data for one <tr>/player

  #Now, loop back up and get the next <tr>/player
end


--output:--
Drew Brees (QB, NO)
ATL
#26
1
1
27.49/40.64
335.01
3.07
0.71
2.84
10.42
0.06
17.4-33.58
25.49
29.33
$0
0
26.25
$0
0
25.54
$0
0
25.54
$0
0
26.25
$0
0
----------
Peyton Manning (QB, DEN)
...
...

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/25315961/
