I've been searching Stack Overflow but couldn't find anyone with this kind of problem.
I want to do something like this:
Input string:
<?xml version="1.0" encoding="UTF-8" ?>
<List>
<Object>
<Section>Fruit</Section>
<Category>Bananas</Category>
<Brand>Chiquita</Brand>
<Obs><p>
Vende-se a peças ou o conjunto.</p><br>
</Obs>
</Object>
</List>
What I want is to strip the HTML tags, such as <p>, <br>, and so on, so that it ends up like this:
<?xml version="1.0" encoding="UTF-8" ?>
<List>
<Object>
<Section>Fruit</Section>
<Category>Bananas</Category>
<Brand>Chiquita</Brand>
<Obs>
Vende-se a peças ou o conjunto.
</Obs>
</Object>
</List>
I've been playing around with JSoup, but I can't seem to get it to work.
Here is my code:
Whitelist whitelist = Whitelist.none();
String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?><List><Object><Section>Fruit</Section><Category>Bananas</Category><Brand>Chiquita</Brand><Obs><p>Vende-se a peças ou o conjunto.</p><br></Obs></Object></List>";
whitelist.addTags(new String[]{"?xml", "List", "Object", "Section", "Category", "Brand", "Obs"});
String safe = Jsoup.clean(xml, whitelist);
This is the result I get:
FruitBananasChiquitaVende-se a peças ou o conjunto.
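Thanks in advance.
Best answer
Jsoup's HTML parser normalizes tag names to lowercase, so the whitelist entries have to be lowercase too. Use: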
whitelist.addTags(new String[] { "?xml", "list", "object", "section",
"category", "brand", "obs" });
Output:
<list>
<object>
<section>
Fruit
</section>
<category>
Bananas
</category>
<brand>
Chiquita
</brand>
<obs>
Vende-se a peças ou o conjunto.
</obs></object>
</list>
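Note that the output above has lowercase element names and no XML declaration, which differs from the result the question asked for. If the original casing has to survive, one possible alternative is to skip the HTML parser and filter tags with a regular expression. The following is only a minimal sketch using the Java standard library, not JSoup: the class TagStripper and the method stripExcept are hypothetical names made up for this illustration, the name matching is case-sensitive, and a regex like this is not a real XML or HTML parser (it will misbehave on comments, CDATA sections, or attribute values containing ">").

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JSoup: keeps allow-listed tags verbatim.
public class TagStripper {
    // Matches an opening, closing, or self-closing tag and captures its name.
    // "<?xml ... ?>" does not match because '?' is not a letter.
    private static final Pattern TAG =
            Pattern.compile("</?([A-Za-z][A-Za-z0-9]*)[^>]*>");

    // Removes every tag whose name is not in `allowed`; text content is kept.
    public static String stripExcept(String input, Set<String> allowed) {
        Matcher m = TAG.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Keep the tag exactly as written if allowed, otherwise drop it.
            String replacement = allowed.contains(m.group(1)) ? m.group() : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e7 is the "ç" in "peças", escaped to avoid source-encoding issues.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?><List><Object>"
                + "<Section>Fruit</Section><Category>Bananas</Category>"
                + "<Brand>Chiquita</Brand>"
                + "<Obs><p>Vende-se a pe\u00e7as ou o conjunto.</p><br></Obs>"
                + "</Object></List>";
        Set<String> allowed = new HashSet<>(Arrays.asList(
                "List", "Object", "Section", "Category", "Brand", "Obs"));
        System.out.println(stripExcept(xml, allowed));
    }
}

Because allowed tags are copied through unchanged, <List> stays <List> and the XML declaration is left alone. For untrusted input, a real parser is still the safer choice.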
Regarding "java - JSoup strip html tags from xml", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21833738/