java - How to extract text from a div tag in HTML in Java

Tags: java java-me html-parsing

Hi,

I want to extract the text between the div tags:

<div class="innercontenttxt"> 
<p><img border="1" align="left" height="170" width="324" vspace="3" hspace="2" src="/tmdbuserfiles/ramdev-balakrishna(1).jpg" alt="ramdev aide remanded, lakrishna acharya judicial remand, ramdev aide fake passport case, baba ramdev assistant judicial custody, balakrishna sent to judicial custody, yoga guru ramdev assistant remanded, yoga guru ramdev assistant balakrishna" />
Yoga guru Ramdev's aide Balakrishna Acharya remanded to 14 days judicial custody in a fake passport on Saturday. He was arrested yesterday after he failed to appear at a Dehradun court.
    <br />
    <br />
     Balakrishna Acharya, who is basically a Nepalese citizen, 
     is alleged to have submitted fake documents to procure a passport. 
     When he failed to appear in Dehradun court in connection with the case,
</p>  
</div>

The extracted result should be:

ramdev aide alakrishna Acharya remanded to 14 days judicial custody in a fake passport on Saturday. He was arrested yesterday after he failed to appear at a Dehradun court.Balakrishna Acharya, who is basically a Nepalese citizen, is alleged to have submitted fake documents to procure a passport. When he failed to appear in Dehradun court in connection with the case, the court had issued a non-bailable warrant and subsequently arrested him yesterday.

Best answer

This question seems similar to this other question.

Assuming you have already stored the HTML source in a String variable named htmlPage:

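// Find the first "<div" in the page, then the '>' that ends its opening tag.
int divIndex = htmlPage.indexOf("<div");
divIndex = htmlPage.indexOf(">", divIndex);

// Everything up to the next "</div>" is the div's inner HTML; it still contains tags such as <p> and <br />.
int endDivIndex = htmlPage.indexOf("</div>", divIndex);
String content = htmlPage.substring(divIndex + 1, endDivIndex);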
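
The snippet above grabs the first div on the page and returns its inner HTML with the markup still in it. As a rough sketch of how the same indexOf/substring idea could be pushed a bit further to get plain text, something like the following might work. The class name DivTextExtractor and the methods innerHtmlOfDiv, stripTags, and extractDivText are hypothetical names made up for this illustration, not part of any library. It assumes the target div carries its class as class="..." with double quotes, that there are no nested divs inside it, and that no attribute value contains a '>' character. HTML entities such as &amp; are left undecoded.

```java
// Hypothetical helper, a sketch only: not a real HTML parser.
class DivTextExtractor {

    // Returns the inner HTML of the first <div> whose opening tag contains
    // class="className", or null if no such div (or no closing tag) is found.
    // Assumption: the div does not contain nested <div> elements.
    static String innerHtmlOfDiv(String html, String className) {
        String marker = "class=\"" + className + "\"";
        int tagStart = html.indexOf("<div");
        while (tagStart >= 0) {
            int tagEnd = html.indexOf(">", tagStart);
            if (tagEnd < 0) {
                return null; // opening tag is never closed
            }
            String openingTag = html.substring(tagStart, tagEnd + 1);
            if (openingTag.indexOf(marker) >= 0) {
                int close = html.indexOf("</div>", tagEnd);
                if (close < 0) {
                    return null; // no matching closing tag
                }
                return html.substring(tagEnd + 1, close);
            }
            // Not the div we want: continue with the next "<div".
            tagStart = html.indexOf("<div", tagEnd);
        }
        return null;
    }

    // Drops everything between '<' and '>' and collapses runs of whitespace
    // into a single space. Leading and trailing whitespace is discarded.
    static String stripTags(String fragment) {
        StringBuffer out = new StringBuffer();
        boolean insideTag = false;
        boolean pendingSpace = false;
        for (int i = 0; i < fragment.length(); i++) {
            char c = fragment.charAt(i);
            if (c == '<') {
                insideTag = true;
                continue;
            }
            if (c == '>') {
                insideTag = false;
                continue;
            }
            if (insideTag) {
                continue; // skip tag names and attributes
            }
            if (c == ' ' || c == '\n' || c == '\r' || c == '\t') {
                pendingSpace = true;
                continue;
            }
            if (pendingSpace && out.length() > 0) {
                out.append(' ');
            }
            pendingSpace = false;
            out.append(c);
        }
        return out.toString();
    }

    // Convenience wrapper: plain text of the div, or null if it was not found.
    static String extractDivText(String html, String className) {
        String inner = innerHtmlOfDiv(html, className);
        return inner == null ? null : stripTags(inner);
    }
}
```

With the page source in htmlPage, the call would be DivTextExtractor.extractDivText(htmlPage, "innercontenttxt"). Checking for a null result covers the case where the div is missing, which the shorter snippet does not handle (its indexOf calls return -1 and substring can then throw).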

Regarding "java - How to extract text from a div tag in HTML in Java", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/11662505/
