Parsing RSS and Atom with ROME: it couldn’t be easier
ROME is a Java library for working with RSS and Atom feeds: reading, writing,
merging and converting them. Because RSS and Atom feeds are standardized, ROME
converts every entry in a feed to one standard object (SyndEntry). In this
article I’ll describe how easy it is to parse an RSS feed with ROME.
Some people have probably written their own simple RSS parser, but believe me,
this one is easier and can do more.


Parsing an RSS feed shouldn’t be too difficult. The last time I did it, I
wanted to read the RSS feed from Apple to see what new movie trailers they had
to offer. I added dependencies on JDOM 1.0, Jaxen and SAXPath in Maven2 and made
a Trailer object (a simple bean with fields for title, link, description and
date).
Then I built this piece of code:
// JDOM 1.0: SAXBuilder parses the feed, XPath selects every <item> element
SAXBuilder builder = new SAXBuilder();
List<Trailer> list = new ArrayList<Trailer>();
Document doc = builder.build(new URL("http://images.apple.com/trailers/rss/newtrailers.rss"));
XPath xpath = XPath.newInstance("//item");
List<Element> results = xpath.selectNodes(doc);
for (Element e : results) {
    // Copy each field from the XML element into the bean by hand
    Trailer t = new Trailer();
    t.setTitle(e.getChild("title").getText());
    t.setLink(e.getChild("link").getText());
    t.setDescription(e.getChild("description").getText().replaceAll("\n", ""));
    String dateString = e.getChild("pubDate").getText();
    t.setDate(Utils.parseDate(dateString));
    list.add(t);
}
I had to populate the Trailer object myself and use XPath to get the items in
the feed. I was positive that someone had already done this and made a nice
wrapper for it. I started searching and found ROME
(https://rome.dev.java.net/).
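The hand-written parser above relies on a Trailer bean and a Utils.parseDate helper that are not shown. The following is only a sketch of what they might look like, assuming the feed uses RFC 822 style dates in pubDate (as RSS 2.0 prescribes); the field names and the date pattern are my assumptions, not the original code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical bean matching the description: title, link, description and date.
class Trailer {
    private String title;
    private String link;
    private String description;
    private Date date;

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getLink() { return link; }
    public void setLink(String link) { this.link = link; }
    public String getDescription() { return description; }
    public void setDescription(String description) { this.description = description; }
    public Date getDate() { return date; }
    public void setDate(Date date) { this.date = date; }
}

// Hypothetical helper; the real Utils class from the old parser is not shown in the article.
class Utils {
    // Assumed pubDate layout, e.g. "Fri, 21 Mar 2008 10:00:00 GMT"
    private static final String RFC822_PATTERN = "EEE, dd MMM yyyy HH:mm:ss z";

    static Date parseDate(String text) {
        // SimpleDateFormat is not thread-safe, so create a fresh instance per call.
        // Locale.ENGLISH keeps day and month names independent of the system locale.
        SimpleDateFormat format = new SimpleDateFormat(RFC822_PATTERN, Locale.ENGLISH);
        try {
            return format.parse(text.trim());
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not an RFC 822 date: " + text, e);
        }
    }
}
```

Even this small helper has to deal with locales and time zones, which is exactly the kind of detail I would rather leave to a library.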
Parsing with ROME
When I try something new, I want to see results on my screen as soon as
possible and understand the details later. So that’s what I’ll show you.
Start a new project in your favorite IDE and include the jars for JDOM and ROME
(see “Files needed” below for where to download the files or which dependency to
include for Maven2).
Now start a new JUnit test or make a simple application and
put the following snippet somewhere where it will be executed:
SyndFeedInput sfi = new SyndFeedInput();
URL url = new URL("http://images.apple.com/trailers/rss/newtrailers.rss");
SyndFeed feed = sfi.build(new XmlReader(url));
List<SyndEntry> entries = feed.getEntries();
for (SyndEntry entry : entries) {
    System.out.println(entry.getTitle());
}
For people who like to brag about how few lines of code they need, it would
look like this:
List<SyndEntry> entries = new SyndFeedInput().build(new XmlReader(new URL("http://images.apple.com/trailers/rss/newtrailers.rss"))).getEntries();
One statement and you have your list!
Details
ROME can parse all types of RSS and Atom 0.3 feeds. The input of a
SyndFeedInput can be a W3C XML Document, a JDOM Document, a File, a SAX
InputSource or a java.io.Reader.
The XmlReader I used is such a Reader, and it does some voodoo according to the
documentation. This voodoo tries to figure out the character set of the XML.
With my old parser I didn’t take into account that my streams could contain
strange characters. When I wanted to subscribe to a Japanese feed I ran into
trouble: no output at all! With the voodoo of ROME I didn’t have to worry about
that anymore.
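To give an idea of what figuring out the character set involves, here is a heavily simplified sketch of my own. It is not ROME’s actual algorithm: it only checks for a byte order mark and for the encoding attribute in the XML declaration, and falls back to UTF-8. The class and method names are made up for this illustration, and a real detector would also look at things like the HTTP Content-Type header.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of ROME.
class XmlEncodingSniffer {
    // Matches the encoding attribute of an XML declaration, e.g. encoding="Shift_JIS"
    private static final Pattern ENCODING =
        Pattern.compile("<\\?xml[^>]*encoding\\s*=\\s*[\"']([A-Za-z0-9._-]+)[\"']");

    // Guesses the encoding from the first bytes of an XML document.
    static String guessEncoding(byte[] head) {
        // 1. A byte order mark is the strongest hint.
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";

        // 2. Otherwise read the declaration. ISO-8859-1 maps every byte to a
        //    character, so the ASCII-only declaration survives the decoding.
        String prolog = new String(head, 0, Math.min(head.length, 200),
                                   StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(prolog);

        // 3. XML without any hint defaults to UTF-8.
        return m.find() ? m.group(1) : "UTF-8";
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

Calling guessEncoding on the first bytes of a feed returns a charset name that could be handed to an InputStreamReader. With ROME you never have to write this yourself.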

As far as I can tell, the characters in the output are right. If you get
question marks instead, you probably forgot to set the output encoding to
UTF-8. On Windows it can also be hard to get the characters to display
correctly; one solution is to create a servlet with a JSP that writes HTML to
your browser. Include <%@ page contentType="text/html; charset=UTF-8" %> in
your JSP and it should work.
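For a plain console application, setting the output encoding could look like the sketch below. It assumes your terminal can actually display UTF-8; the Utf8Console class is a name I made up for this example.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper for illustration only.
class Utf8Console {
    // Wraps a stream so that everything printed through it is encoded as UTF-8.
    static PrintStream utf8(OutputStream target) {
        try {
            // The second argument enables auto-flush on println.
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 is not available", e);
        }
    }
}
```

In the ROME snippet you would then create the stream once with PrintStream out = Utf8Console.utf8(System.out); and call out.println(entry.getTitle()) instead of System.out.println.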
Another nice detail is that the publishing date is converted to a java.util.Date object.
But wait, there is more…
Parsing feeds is not the only thing ROME can do. It’s also possible to convert
one format to another, create your own feeds, merge feeds and write a feed to
an output stream (like a file or a servlet response).
The ROME wiki is very good. There are some nice tutorials and links to articles
about ROME. With many projects like these the documentation is really bad and
you have to figure out a lot yourself, but with ROME even the Javadoc is quite
elaborate.
So now you know that something like ROME exists.
Files needed
Or a Maven2 dependency:
<dependency>
<groupId>
<artifactId>
<version>0.8</version>
</dependency>
Sources
ROME in a Day: Parse and Publish Feeds in Java (an article on xml.com)
Hi,
I am kind of new to RSS, ROME and Maven2.
I have already set up Maven 2, but the only document about using ROME with Maven is an old one on java.net that relates to Maven 1.
I copied the ROME dependency into pom.xml and put rome-0.9.jar and jdom.jar in WEB-INF/lib.
Now, to parse an RSS feed, could you tell me where I should put this snippet:
SyndFeedInput sfi = new SyndFeedInput();
URL url = new URL("http://images.apple.com/trailers/rss/newtrailers.rss");
SyndFeed feed = sfi.build(new XmlReader(url));
List<SyndEntry> entries = feed.getEntries();
for (SyndEntry entry : entries) {
    System.out.println(entry.getTitle());
}
Should it be in a Java file or in a JSP?
Thanks in advance