java - Help building an RSS reader in Java

Tags: java xml dom rss

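For a class project, I'm trying to write a simple RSS reader for my Java class. I'm walking the DOM tree by hand just to get experience doing it, even though I know there are better, more efficient approaches and tools. I have a ReaderObject that holds the feed's basic title, link, and description, plus a List of RSSItem objects with the instance variables title, link, description, pubDate, and guid. My hope is that with this information I can parse the feed and redisplay it in a nicer way. I'm stuck on the RSSItem part, because the text I get there is blank. I also don't know whether this is a good approach, or whether I fully understand it...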

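Another question: when you call getChildNodes and then go through the result in a for loop to get each item, why is getFirstChild needed at that point? I took this from an example in a book, but I don't know why it is there.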

Here is my code:

Code: 
import java.io.*;
import java.util.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.*;


public class RSSReader {
    public static void main(String[] args) {
        File f = new File("testrss.xml");
        if (f.isFile()) {
            System.out.println("is File");
            RSSReader xml = new RSSReader(f);
        }
    }

    public RSSReader(File xmlFile) {
        try {
            obj = new ReaderObject();

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(xmlFile); // Document extends Node

            List<Node> nodeList = new ArrayList<Node>();
            nodeList.add(doc);

            while (nodeList.size() > 0)
            {
            Node node = nodeList.get(0);

//            if(node instanceof Document)
//                 System.out.println("Document Node");

            // Get entries in the xml file
            if (node.hasChildNodes()) {
                NodeList nl = node.getChildNodes();
                for(int i = 0; i < nl.getLength(); i++) {
                    if (nl.item(i) instanceof Element) {
                        Element childElement = (Element) nl.item(i);
                        nodeList.add(childElement);
                        //nodeList.add(nl.item(i));
                    }
                }
            }

            if (node instanceof Element) {
                // Print out the element tag name
                System.out.println("Element Node: " + ((Element)node).getTagName());

                // Print out the attributes of the element
                if (node.hasAttributes()) {
                    NamedNodeMap attrMap = node.getAttributes();
                    for (int i = 0; i < attrMap.getLength(); i++) {
                        Attr attribute = (Attr) attrMap.item(i);
                        System.out.print("\tAttribute Key: " + attribute.getName() + " Value: " + attribute.getValue());
                    }
                    System.out.println();
                }

                // Get children of node
                if (node.hasChildNodes()) {
                    NodeList childrenList = node.getChildNodes();
                    for (int j = 0; j < childrenList.getLength(); j++) {
                        Node child = childrenList.item(j);
                        Element childElement;
                        Text textNode;
                        if (child instanceof Element) {
                            childElement = (Element) child;
                            textNode = (Text) childElement.getFirstChild();
                            String text = textNode.getData().trim();
                            if (childElement.getTagName().toLowerCase().equals("title")) {

                                obj.setTitle(text);
                                System.out.println("Title: " + obj.getTitle());
                            }
                            else if (childElement.getTagName().toLowerCase().equals("link")) {
                                obj.setLink(text);
                                System.out.println("Link: " + obj.getLink());
                            }
                            else if (childElement.getTagName().toLowerCase().equals("description")) {
                                obj.setDescription(text);
                                System.out.println("Description: " + obj.getDescription());
                            }
                            else if (childElement.getTagName().toLowerCase().equals("item")) {
                                RSSItem item = new RSSItem();
                                System.out.println("item text: " + text); // STUCK HERE
                                item.setTitle(text);
                                System.out.println("RSS Item title: " + item.getTitle());
                            }
                        }
                    }
                }
            }

            nodeList.remove(0);
            }
        }
        catch (IOException e) {
            e.printStackTrace();
        }
        catch (SAXException e) {
            e.printStackTrace();
        }
        catch (IllegalArgumentException e) {
            e.printStackTrace();
        }
        catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
    }
    private ReaderObject obj;
}
class ReaderObject {
    public ReaderObject() {
        this.title = "";
        this.link = "";
        this.description = "";
    }    

    public ReaderObject(String title, String link, String description) {

        this.title = title;
        this.link = link;
        this.description = description;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public void setLink(String link) {
        this.link = link;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    public String getTitle() {
        return title;
    }

    public String getLink() {
        return link;
    }

    public String getDescription() {
        return description;
    }

    private String title;
    private String link;
    private String description;
    private List<RSSItem> items = new ArrayList<RSSItem>();
}

class RSSItem {

    public RSSItem() {

        this.title = "";
        this.link = "";
        this.description = "";
        this.pubDate = "";
        this.guid = "";
    }    

    public RSSItem(String title, String link, String description, String item, String pubDate, String guid) {

        this.title = title;
        this.link = link;
        this.description = description;
        this.pubDate = pubDate;
        this.guid = guid;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public void setLink(String link) {
        this.link = link;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    public void setPubDate(String pubDate) {
        this.pubDate = pubDate;
    }

    public void setGuid(String guid) {
        this.guid = guid;
    }

    public String getTitle() {
        return title;
    }
    private String title;
    private String link;
    private String description;
    private String pubDate;
    private String guid;
}

Output:
is File
Element Node: rss
    Attribute Key: version Value: 2.0
Element Node: channel
Title: Liftoff News
Link: http://liftoff.msfc.nasa.gov/
Description: Liftoff to Space Exploration.
item text: 
RSS Item title: 
item text: 
RSS Item title: 
item text: 
RSS Item title: 
item text: 
RSS Item title: 
Element Node: title
Element Node: link
Element Node: description
Element Node: language
Element Node: pubDate
Element Node: lastBuildDate
Element Node: docs
Element Node: generator
Element Node: managingEditor
Element Node: webMaster
Element Node: item
Title: Star City
Link: http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp
Description: How do Americans get ready to work with Russians aboard the International Space Station? They take a crash course in culture, language and protocol at Russia's <a href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm">Star City</a>.
Element Node: item
Description: Sky watchers in Europe, Asia, and parts of Alaska and Canada will experience a <a href="http://science.nasa.gov/headlines/y2003/30may_solareclipse.htm">partial eclipse of the Sun</a> on Saturday, May 31st.
Element Node: item
Title: The Engine That Does More
Link: http://liftoff.msfc.nasa.gov/news/2003/news-VASIMR.asp
Description: Before man travels to Mars, NASA hopes to design new engines that will let us fly through the Solar System more quickly. The proposed VASIMR engine would do that.
Element Node: item
Title: Astronauts' Dirty Laundry
Link: http://liftoff.msfc.nasa.gov/news/2003/news-laundry.asp
Description: Compared to earlier spacecraft, the International Space Station has many luxuries, but laundry facilities are not one of them. Instead, astronauts have other options.
Element Node: title
Element Node: link
Element Node: description
Element Node: pubDate
Element Node: guid
Element Node: description
Element Node: pubDate
Element Node: guid
Element Node: title
Element Node: link
Element Node: description
Element Node: pubDate
Element Node: guid
Element Node: title
Element Node: link
Element Node: description
Element Node: pubDate
Element Node: guid

XML Code:
    <?xml version="1.0"?> 
    <rss version="2.0"> 
    <channel> 
    <title>Liftoff News</title> 
    <link>http://liftoff.msfc.nasa.gov/</link> 
    <description>Liftoff to Space Exploration.</description> 
    <language>en-us</language> 
    <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
     <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate> 
     <docs>http://blogs.law.harvard.edu/tech/rss</docs> 
     <generator>Weblog Editor 2.0</generator> 
     <managingEditor>editor@example.com</managingEditor>
      <webMaster>webmaster@example.com</webMaster>
       <item> 
       <title>Star City</title> 
       <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link> 
       <description>How do Americans get ready to work with Russians aboard the International Space Station? They take a crash course in culture, language and protocol at Russia's &lt;a href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm"&gt;Star City&lt;/a&gt;.</description>
        <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate> 
        <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
         </item> 
         <item> 
         <description>Sky watchers in Europe, Asia, and parts of Alaska and Canada will experience a &lt;a href="http://science.nasa.gov/headlines/y2003/30may_solareclipse.htm"&gt;partial eclipse of the Sun&lt;/a&gt; on Saturday, May 31st.</description>
          <pubDate>Fri, 30 May 2003 11:06:42 GMT</pubDate> 
          <guid>http://liftoff.msfc.nasa.gov/2003/05/30.html#item572</guid>
           </item> <item> <title>The Engine That Does More</title> 
           <link>http://liftoff.msfc.nasa.gov/news/2003/news-VASIMR.asp</link> 
           <description>Before man travels to Mars, NASA hopes to design new engines that will let us fly through the Solar System more quickly. The proposed VASIMR engine would do that.</description>
            <pubDate>Tue, 27 May 2003 08:37:32 GMT</pubDate> 
            <guid>http://liftoff.msfc.nasa.gov/2003/05/27.html#item571</guid> 
            </item> <item> <title>Astronauts' Dirty Laundry</title>
             <link>http://liftoff.msfc.nasa.gov/news/2003/news-laundry.asp</link> 
             <description>Compared to earlier spacecraft, the International Space Station has many luxuries, but laundry facilities are not one of them. Instead, astronauts have other options.</description> <pubDate>Tue, 20 May 2003 08:56:02 GMT</pubDate> 
             <guid>http://liftoff.msfc.nasa.gov/2003/05/20.html#item570</guid> 
             </item>
              </channel> 
              </rss>

Best answer

For RSS, you can use a more specific API: Rome. There is also an article on how to use it.

The getFirstChild() above is needed because your Element does not contain the text itself: it contains a Text node, and that node in turn holds the text.
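
The same point probably explains the blank item text in the question. For an element like title, the first child is the Text node with the title string. For item, the first child is the whitespace Text node between the opening item tag and its first child element, so trim() leaves an empty string. Below is a minimal sketch of this, assuming a small inline feed cut down from the one in the question. The names FirstChildDemo, parse, and childText are hypothetical, introduced only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class FirstChildDemo {

    // Parse an XML string into a DOM tree. Checked parser errors are
    // rethrown unchecked only to keep this sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Trimmed text of the first direct child element named tag,
    // or "" if the parent has no such child (hypothetical helper).
    static String childText(Element parent, String tag) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element && ((Element) child).getTagName().equals(tag)) {
                return child.getTextContent().trim();
            }
        }
        return "";
    }

    public static void main(String[] args) {
        // Assumed sample input, reduced from the feed in the question.
        String xml =
              "<channel>\n"
            + "  <title>Liftoff News</title>\n"
            + "  <item>\n"
            + "    <title>Star City</title>\n"
            + "  </item>\n"
            + "</channel>";
        Document doc = parse(xml);
        Element channelTitle = (Element) doc.getElementsByTagName("title").item(0);
        Element item = (Element) doc.getElementsByTagName("item").item(0);

        // The first child of <title> is the Text node holding the title string.
        System.out.println("[" + channelTitle.getFirstChild().getNodeValue().trim() + "]");
        // The first child of <item> is the whitespace Text node before its <title>.
        System.out.println("[" + item.getFirstChild().getNodeValue().trim() + "]");
        // Reading the item's <title> child element gives the item title.
        System.out.println("[" + childText(item, "title") + "]");
    }
}
```

Run as written, this prints [Liftoff News], then [], then [Star City]. In the question's loop, the item branch could call a helper like childText(childElement, "title") in place of reusing text; an item that has no title element, like the second one in the sample feed, would then get an empty title.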

Regarding "java - Help building an RSS reader in Java", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/3689845/
