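I want to build a dynamic site that needs some images from the internet. I decided to scrape them from flickr and credit the owner on my site, but I'm having trouble with the scraping. I'll post part of the HTML below, but if you want to look at the source yourself, the site is here: https://www.flickr.com/explore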
HTML:
<div class="thumb ">
<span class="photo_container pc_ju">
<a data-track="photo-click" href="/photos/sheilarogers13/15586482942/in/explore-2014-10-20" title="Lake District" class="rapidnofollow photo-click"><img id="photo_img_15586482942" src="https://c2.staticflickr.com/4/3945/15586482942_6a7154363f_z.jpg"width="508" height="339" alt="Lake District" class="pc_img " border="0"><div class="play"></div></a>
</span>
<div class="meta">
<div class="title"><a data-track="photo-click" href="/photos/sheilarogers13/15586482942/in/explore-2014-10-20" title="Lake District" class="title">Lake District</a></div>
<div class="attribution-block">
<span class="attribution">
<span>by </span>
******<a data-track="owner" href="/photos/sheilarogers13" title="sheilarogers22" class="owner">sheilarogers22</a>******
</span>
</div>
<span class="inline-icons">
<a data-track="favorite" href="#" class="rapidnofollow fave-star-inline canfave" title="Add this photo to your favorites?"><img width="12" height="12" alt="[★]" src="https://s.yimg.com/pw/images/spaceball.gif" class="img"><span class="fave-count count">99+</span></a>
<a title="Comments" href="#" class="rapidnofollow comments-icon comments-inline-btn">
<img width="12" height="12" alt="Comments" src="https://s.yimg.com/pw/images/spaceball.gif">
<span class="comment-count count">57</span>
</a>
<a href="#" data-track="lightbox" class="rapidnofollow lightbox-inline" title="View in light box"><img width="12" height="12" alt="" src="https://s.yimg.com/pw/images/spaceball.gif"></a>
</span>
</div>
</div>
I want the line I marked with asterisks, so that I can credit the photo's owner.
My code:
Elements pgElem = doc.select("div.thumb").select("div.meta").select("[data-track]");
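However, the code above gives me all 4 data-track elements inside my div.meta, and I only want the one with data-track=owner.
I looked at the JSoup documentation, which says that [attr=value] finds elements whose attribute has the given value, but I can't seem to get it to work. I tried: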
.select("[data-track=owner]")
.select("[data-track='owner']")
But neither worked. Any ideas?
Best answer
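// Start from every element in div.meta that carries a data-track attribute.
Elements pgElem = doc.select("div.thumb").select("div.meta").select("[data-track]");
Elements ownerElements = new Elements();
for (Element element : pgElem) {
    // Keep the element if it (or one of its descendants) has a data-track value containing "owner".
    if (!element.getElementsByAttributeValueContaining("data-track", "owner").isEmpty()) {
        ownerElements.add(element);
    }
}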
Actually, I just gave it another try, and this works for me:
doc.select("div.thumb").select("div.meta").select("[data-track=owner]")
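For completeness, here is a minimal, self-contained sketch of the attribute-value selector applied to a trimmed copy of the markup from the question. The class name OwnerSelectorSketch, the credits helper, and the base URI passed to Jsoup.parse are illustrative assumptions, not part of the original post. It also assumes a jsoup version that provides selectFirst, and that the real page still has this structure.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class OwnerSelectorSketch {

    // Trimmed-down copy of the markup from the question (one thumbnail only).
    static final String HTML =
          "<div class=\"thumb \">"
        + "<span class=\"photo_container pc_ju\">"
        + "<a data-track=\"photo-click\" href=\"/photos/sheilarogers13/15586482942/in/explore-2014-10-20\">"
        + "<img src=\"https://c2.staticflickr.com/4/3945/15586482942_6a7154363f_z.jpg\" class=\"pc_img \"></a>"
        + "</span>"
        + "<div class=\"meta\">"
        + "<div class=\"title\"><a data-track=\"photo-click\" href=\"#\" class=\"title\">Lake District</a></div>"
        + "<span class=\"attribution\"><span>by </span>"
        + "<a data-track=\"owner\" href=\"/photos/sheilarogers13\" class=\"owner\">sheilarogers22</a></span>"
        + "<a data-track=\"favorite\" href=\"#\">99+</a>"
        + "<a data-track=\"lightbox\" href=\"#\"></a>"
        + "</div></div>";

    // Hypothetical helper: one "imageUrl by ownerName (ownerProfileUrl)" line per thumbnail.
    static List<String> credits(Document doc) {
        List<String> out = new ArrayList<>();
        for (Element thumb : doc.select("div.thumb")) {
            Element img = thumb.selectFirst("img.pc_img");
            // [data-track=owner] matches only elements whose data-track attribute equals "owner".
            Element owner = thumb.selectFirst("div.meta a[data-track=owner]");
            if (img == null || owner == null) {
                continue; // skip thumbnails that lack an image or an owner link
            }
            out.add(img.attr("src") + " by " + owner.text() + " (" + owner.absUrl("href") + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        // The base URI is an assumption; it lets absUrl resolve the relative href.
        Document doc = Jsoup.parse(HTML, "https://www.flickr.com/");
        for (String credit : credits(doc)) {
            System.out.println(credit);
        }
    }
}
```

Doing the lookup once per div.thumb keeps each image paired with its own owner, which a single document-wide select("[data-track=owner]") would not guarantee if some thumbnail had no owner link.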
Source: a similar question on Stack Overflow, "java - JSoup: scraping an HTML document by attribute value": https://stackoverflow.com/questions/26498496/