java - Select only from child nodes using Jsoup?

Tags: java html dom screen-scraping jsoup

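I am currently working with a <ul> whose first level contains many <li> elements. I want to get those elements, and only those. However, when I fetch them with a Jsoup selector or with getElementsByTag, the result also includes the <li> elements nested inside those first-level <li> elements.

What can I do to get only the first-level <li> elements?

Here is the code: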

Elements bundleList =  indieGala.select("section.games_bundle_box2")
                        .get(0).select("ul.unlock")
                        .get(0).getElementsByTag("li");

Here is the HTML:

<section class="games_bundle_box2">
  <div class="games-container">

    <!-- List Game Unlocked -->
    <ul class="unlock">

      <!-- Item -->
      <li>
        <!-- Preview Thumb -->
        <a href="#game1" class="fancybox-various" title="Desura &amp; Steam for Windows and Mac - This game has been GreenLighted on Steam and all buyers of The IndieGala Flashpoint bundle will receive Steam keys in a few weeks!">
          <span class="tier1">
            Pay minimum 
            <em class="color-text">
              $1
            </em>
            to get Steam &amp; Desura keys! 
          </span>
          <span class="boxed">
          </span>
          <span class="steam-temp">
          </span>
          <span class="desura-icon">
          </span>
          <img src="img_game/ig-flashpoint/knytt.jpg" alt="">
          <!-- 110 x 110 -->
          <span class="item-title">
            Knytt Underground
          </span>
        </a>
        <!-- End Preview Thumb -->

        <!-- Pop-Up -->

        <div id="game1" class="modal_content">

          <!-- Video Game Box -->
          <div class="game_movie">
            <iframe width="488" height="300" src="http://www.youtube.com/embed/NwZ2Z7WRQrI?rel=0&amp;vq=hd720" frameborder="0" allowfullscreen="">
            </iframe>
          </div>
          <!-- End Video Game Box -->

          <!-- Game Description -->
          <div class="game_description">
            <h2>
              <a href="http://www.desura.com/games/knytt-underground" target="_blank">
                Knytt Underground
              </a>
            </h2>
            <a href="http://nifflas.ni2.se/" target="_blank">
              Nifflas' Games
            </a>
            <p style="font-weight: bold;">
              IMPORTANT NOTE: This game has been accepted on Steam. As soon as it will available on the Steam store, all buyers will receive a Steam key for it.
            </p>
            <p>
              Immerse yourself in a unique and challenging underground adventure!
            </p>
            <p>
              Knytt Underground is the latest iteration in the critically acclaimed Knytt series. It takes place in the same universe as Knytt Stories, the renowned indie game from Nicklas Nygren, a.k.a. 'Nifflas'.
            </p>
            <p>
              Each room in the huge map offers a small challenge to overcome using the unique tried and tested mechanics that have already reached over 1 million fans worldwide. Blending zen-like gameplay; enjoy an unparalled sense of freedom by running, jumping, climbing, swinging and bouncing. Knytt Underground delivers a unique and captivating, platform adventure experience allowing you to explore over 1,800 rooms and complete multiple story-driven quests. 
            </p>
            <p>
              <a href="#" target="_blank">
                System requirements
              </a>
            </p>
            <p>
              Operating system: Windows (XP / Vista / 7 / 8) or Mac OS 10.6
              <br>
              OpenGL compatible graphics card with 512MB memory or more.
              <br>
              A 2ghz CPU is needed.
              <br>
              The OS X version requires Mac OS 10.6 and must be running in 64-bit mode.
              <br>
            </p>
          </div>
        </div>
        <!-- End Pop-Up -->
      </li>
      <!-- End Item -->

      <li>
        <a href="#game2" class="fancybox-various" title="Steam for Windows">
          <span class="boxed">
          </span>
          <span class="steam-icon">
          </span>
          <img src="img_game/ig-flashpoint/saira.jpg" alt="">
          <span class="item-title">
            Saira
          </span>
        </a>
        <div id="game2" class="modal_content">
          <div class="game_movie">
            <iframe width="488" height="300" src="http://www.youtube.com/embed/iFPaXcLF5b0?rel=0&amp;vq=hd720" frameborder="0" allowfullscreen="">
            </iframe>
          </div>
          <div class="game_description">
            <h2>
              <a href="http://store.steampowered.com/app/48900/" target="_blank">
                Saira
              </a>
            </h2>
            <a href="http://nifflas.ni2.se/" target="_blank">
              Nifflas' Games
            </a>
            <p>
              Saira is a puzzle platformer with non-linear gameplay and a whole universe for you to explore. The game is heavily influenced by classic puzzle adventure games and uses a new unique graphical style combining high resolution photography into a lush and mysterious world.
            </p>
            <p>
              The eponymous Saira is a photographer who specializes in digitally capturing dangerous places and animals across the universe. For reasons unknown, she finds herself as the only remaining person in the entire galaxy. Saira has no weapons, she will use only her mind and agility to progress through seven star systems and over 60 well-crafted puzzles. Over two hours of originally-scored music will help her maintain focus and unlock one of six vastly unique endings.
            </p>
            <ul>
              <li>
                Over 60 well-crafted puzzels.
              </li>
              <li>
                A universe full of surprising creatures and locations for you to explore.
              </li>
              <li>
                11 layers of high definition parallax scrolling and a two hour soundtrack gives the universe of Saira it's unique atmosphere.
              </li>
              <li>
                Non-linear gameplay
              </li>
              <li>
                Multiple endings
              </li>
            </ul>
            <p>
              <a href="#" target="_blank">
                System requirements
              </a>
            </p>
            <p>
              OS: Windows 2000 or later
              <br>
              Processor: 2.1Ghz or higher
              <br>
              DirectX®: 9
              <br>
              Hard Drive: 250 MB for downloading the game + 200 MB for playing
              <br>
            </p>
          </div>
        </div>
      </li>
      <!-- End Item -->

      <!-- Item -->
      <li>
        <!-- Preview Thumb -->
        <a href="#game3" class="fancybox-various" title="Steam for Windows">
          <span class="boxed">
          </span>
          <span class="steam-icon">
          </span>
          <img src="img_game/ig-flashpoint/musaic.jpg" alt="">
          <!-- 110 x 110 -->
          <span class="item-title">
            Musaic Box
          </span>
        </a>
        <!-- End Preview Thumb -->

        <!-- Pop-Up -->
        <div id="game3" class="modal_content">

          <!-- Video Game Box -->
          <div class="game_movie">
            <iframe width="488" height="300" src="http://www.youtube.com/embed/ZswOQE8MDbo?rel=0&amp;vq=hd720" frameborder="0" allowfullscreen="">
            </iframe>
          </div>
          <!-- End Video Game Box -->

          <!-- Game Description -->
          <div class="game_description">
            <h2>
              <a href="http://store.steampowered.com/app/29130/" target="_blank">
                Musaic Box
              </a>
            </h2>
            <a href="http://kranx.com/en/" target="_blank">
              KranX Productions
            </a>
            <br>
            <br>
            <p>
              Uncover all of your grandfather's sheet music, hidden in his home amongst a treasure trove of gorgeous antiques and musical relics. Melodious music box games will let you piece these special compositions together and unleash their symphonious secrets. Unlock Creative Mode and write your own outstanding arrangements. With a house full of secrets and a box full of music the aural excitement never ends.
            </p>
            <ul>
              <li>
                Beautifully rendered graphics
              </li>
              <li>
                Entrancing musical score
              </li>
              <li>
                Unlock musical mysteries!
              </li>
            </ul>
            <p>
              <a href="#" target="_blank">
                System requirements
              </a>
            </p>
            <p>
              Supported OS: Microsoft® Windows® 2000/XP/Vista
              <br>
              Processor: 800Mhz
              <br>
              Graphics: DirectX® compatible with 16 MB of Video Memory
              <br>
              Sound: DirectX® compatible Sound Card
              <br>
              DirectX® Version: DirectX 8.0
              <br>
              Memory: 256 MB RAM (512+MB recommended)
              <br>
              Hard Drive: 54 MB or more
              <br>
            </p>
          </div>
          <!-- End Game Description -->

        </div>
        <!-- End Pop-Up -->

      </li>
      <!-- End Item -->

      <!-- Item -->
      <li>
        <!-- Preview Thumb -->
        <a href="#game4" class="fancybox-various" title="Steam for Windows">
          <span class="boxed">
          </span>
          <span class="steam-icon">
          </span>
          <img src="img_game/ig-flashpoint/yumsters.jpg" alt="">
          <!-- 110 x 110 -->
          <span class="item-title">
            Yumsters 2: Around the World
          </span>
        </a>
        <!-- End Preview Thumb -->

        <!-- Pop-Up -->
        <div id="game4" class="modal_content">

          <!-- Video Game Box -->
          <div class="game_movie">
            <iframe width="488" height="300" src="http://www.youtube.com/embed/m19AcPO_LMQ?rel=0&amp;vq=hd720" frameborder="0" allowfullscreen="">
            </iframe>
          </div>
          <!-- End Video Game Box -->

          <!-- Game Description -->
          <div class="game_description">
            <h2>
              <a href="http://store.steampowered.com/app/29120/" target="_blank">
                Yumsters 2: Around the World
              </a>
            </h2>
            <a href="http://kranx.com/en/" target="_blank">
              KranX Productions
            </a>
            <p>
              Not only are these Yumsters crazy for strawberries, they can rock the bongos. For the love of fruity music, help them earn money by cleaning gardens to promote their band. To really skyrocket, Yumsters need the best equipment to win the ultimate grand prize at the fairy town music showdown. Get fruitilicious in five vibrant locations of Yumsters 2, a sweet Match 3 puzzler. 
            </p>
            <p>
              Key Features:
            </p>
            <ul>
              <li>
                Match 3, drag-n-drop 
              </li>
              <li>
                Rhythm-based gameplay 
              </li>
              <li>
                Adorable characters 
              </li>
              <li>
                Win the grand prize! 
              </li>
            </ul>
            <p>
              <a href="#" target="_blank">
                System requirements
              </a>
            </p>
            <p>
              Operating System: Microsoft® Windows® XP/Vista
              <br>
              Processor: 600 Mhz
              <br>
              Memory: 256 MB (512+ MB recommended)
              <br>
              Hard Disk Space: 54 MB (or More) Available HDD Space
              <br>
              Video Card: DirectX® compatible with 16 MB of Video Memory
              <br>
              Sound Card: DirectX® compatible Sound Card
              <br>
              DirectX® Version: DirectX® 7.0
              <br>
            </p>
          </div>
          <!-- End Game Description -->

        </div>
        <!-- End Pop-Up -->
      </li>
      <!-- End Item -->      
<li style="width: 400px;font-weight: bold; margin-top: 20px; position: relative; left: 210px; font-size: 16px;">
Price BLOCKED at 
<span class="color-text">
$3.99
</span>
for the FIRST 8 HOURS
</li>
-->
          </ul>

          <!-- End List Game Locked -->

  </div>
</section>

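Best answer

Maybe I'm overlooking something, but how about selecting based on the direct descendants of ul.unlock? You can do that by using > in the selector expression: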

parent > child: child elements that descend directly from parent, e.g. div.content > p finds p elements; and body > * finds the direct children of the body tag (source)

With ul.unlock > li, it should pick out only the top-level li elements you want, for example:

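// Elements is a List<Element>, so listIterator() is available on the result.
ListIterator<Element> bundleList = indieGala.select("section.games_bundle_box2")
    .get(0)
    .select("ul.unlock > li")   // direct <li> children of ul.unlock only
    .listIterator();

// Iterators is Guava's com.google.common.collect.Iterators; size() exhausts the iterator.
assert Iterators.size(bundleList) == 5;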
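
To make the difference concrete, here is a minimal self-contained sketch. It assumes Jsoup is on the classpath and uses a trimmed-down, hypothetical stand-in for the page in the question (two top-level items, one of which contains a nested list). The class name, method names, and HTML string are illustrative only; they are not from the original post.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class DirectChildrenDemo {
    // Hypothetical sample markup: 2 top-level <li>, the second holding 2 nested <li>.
    static final String HTML =
        "<section class=\"games_bundle_box2\">"
      + "<ul class=\"unlock\">"
      + "<li>Knytt Underground</li>"
      + "<li>Saira<ul><li>Non-linear gameplay</li><li>Multiple endings</li></ul></li>"
      + "</ul>"
      + "</section>";

    // getElementsByTag matches every descendant, so nested items are included.
    static int countAllDescendantLi() {
        Document doc = Jsoup.parse(HTML);
        Element list = doc.select("ul.unlock").first();
        return list.getElementsByTag("li").size();
    }

    // The child combinator restricts the match to direct children of ul.unlock.
    static int countDirectChildLi() {
        Document doc = Jsoup.parse(HTML);
        Elements items = doc.select("ul.unlock > li");
        return items.size();
    }

    // Alternative: children() returns only the immediate child elements.
    static int countViaChildren() {
        Document doc = Jsoup.parse(HTML);
        Element list = doc.select("ul.unlock").first();
        return list.children().size();
    }

    public static void main(String[] args) {
        System.out.println("getElementsByTag: " + countAllDescendantLi());
        System.out.println("ul.unlock > li:   " + countDirectChildLi());
        System.out.println("children():       " + countViaChildren());
    }
}
```

On this sample, getElementsByTag reports 4 elements while both the child combinator and children() report 2. Note that children() returns every direct child element regardless of tag, so it only matches the selector's result when the list contains nothing but <li> children.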

Hope that helps.

Regarding "java - Select only from child nodes using Jsoup?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/18797545/