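I need to retrieve an HTML structure based on a value. That is, I have this value: Test Company IT
I need to retrieve the structure that contains this value using jsoup in Java.
The structure is as follows: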
<div class="search_test">
<div class="inside_test">
<div class="s_test">
<h3 class="result-title">
<span>1.<a href="#">Test Company<span>IT</span></a></span>
</h3>
</div>
</div>
</div>
The text output of this HTML is: Test Company IT. Given that output, I need to get back the HTML shown above.
Best answer
One way to do it:
import java.util.Locale;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class TEST {
    public static void main(String[] args) {
        String html = "<div class=\"search_test\">\n" +
                " <div class=\"inside_test\">\n" +
                " <div class=\"s_test\">\n" +
                " <h3 class=\"result-title\">\n" +
                " <span>1.<a href=\"#\">Test Company<span>IT</span></a></span>\n" +
                " </h3>\n" +
                " </div>\n" +
                " </div>\n" +
                "</div>";
        String companyName = "Test Company IT";

        Document docs = Jsoup.parse(html);

        // Walk every element under <body> (any depth) and remember the
        // last one, in document order, whose text matches the value.
        Element match = null;
        for (Element el : docs.body().getAllElements()) {
            if (normalize(el.text()).equals(normalize(companyName))) {
                match = el;
            }
        }
        if (match == null) {
            System.out.println("No element found for: " + companyName);
            return;
        }

        // Build a descendant selector from the tag names between <body>
        // and the matching element, e.g. "div div ... a".
        StringBuilder comS = new StringBuilder();
        for (Element el = match; el != null && !"body".equals(el.tagName()); el = el.parent()) {
            comS.insert(0, comS.length() > 0 ? el.tagName() + " " : el.tagName());
        }

        System.out.println("--> st : " + comS);
        System.out.println("--> 111 " + docs.select(comS.toString()).text());
        // The HTML of the matching element itself.
        System.out.println(match.outerHtml());
    }

    // Compare ignoring case and whitespace: the extracted text may have no
    // space between "Test Company" and the inline <span>IT</span>.
    private static String normalize(String s) {
        return s.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
    }
}
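The lookup above hinges on comparing an element's extracted text with the search value. A plain `equalsIgnoreCase` check is sensitive to whitespace, and in the sample markup `<span>IT</span>` follows "Test Company" with no whitespace in between, so the extracted text may well come back as "Test CompanyIT" rather than "Test Company IT" (an assumption about how inline elements are joined; worth verifying against the jsoup version in use). A minimal sketch of a whitespace-insensitive comparison, using a hypothetical helper class `TextMatch` that is not part of jsoup:

```java
import java.util.Locale;

// Hypothetical helper, not part of jsoup: compares two strings while
// ignoring case and all whitespace.
final class TextMatch {
    private TextMatch() {}

    // Remove every run of whitespace and lower-case the rest.
    static String normalize(String s) {
        return s.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
    }

    // True when both strings are equal after normalization.
    static boolean sameText(String elementText, String wanted) {
        return normalize(elementText).equals(normalize(wanted));
    }

    public static void main(String[] args) {
        // Text as it might be extracted from the <a> element, versus the search value.
        System.out.println(sameText("Test CompanyIT", "Test Company IT"));
        System.out.println(sameText("1.Test CompanyIT", "Test Company IT"));
    }
}
```

The string passed as `elementText` would be whatever `text()` returns for the element being checked. Dropping all whitespace is deliberately loose: it also treats "TestCompany IT" as a match, which seems acceptable for this lookup but may be too permissive elsewhere.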
Regarding java - Retrieve HTML structure from text using jsoup java, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19177782/