Retrieve song lyrics in Java using Screenscraping with JSoup

Last year I wrote about JSoup, a Java library that helps with screenscraping: Screenscraping from Java using jsoup – effective data gathering from websites. Last month I had another opportunity to use JSoup, this time to gather the lyrics for the songs on a CD. The context was the internal SOA for Java Professionals training program at AMIS. To complete the second block of this three-part program, the students did an assignment that required them to implement a Web Service producing the CD booklet for a given CD, returned as a PDF document with an illustration, song titles and song lyrics. One of the resources we made available to the students was a Java class that returns song lyrics. Their challenge was to integrate this class properly into their application (be it PL/SQL, SOA Suite 11g or OSB based).

The LyricsGatherer is easily constructed using JSoup and a lyrics website (one that suffers from periodic and unfortunate loss of service):


Drilling down into the search results brings us to the actual song lyrics:



And if a browser can do this, so can a Java program (generally speaking and definitely true in this case).

The Java code – leveraging JSoup – to retrieve song lyrics looks like this:



import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class LyricsGatherer {

   // Base URL of the lyrics website
   private final static String songLyricsURL = "";

   public static List<String> getSongLyrics(String band, String songTitle) throws IOException {
      List<String> lyrics = new ArrayList<String>();

      // Page URL: <base>/<band>/<song-title>-lyrics/, lower case, spaces replaced by hyphens
      Document doc = Jsoup.connect(songLyricsURL + "/" + band.replace(" ", "-").toLowerCase() + "/" + songTitle.replace(" ", "-").toLowerCase() + "-lyrics/").get();
      String title = doc.title();
      // The lyrics live in the first paragraph with CSS class songLyricsV14
      Element p = doc.select("p.songLyricsV14").get(0);
      for (Node e : p.childNodes()) {
         // Keep only the text nodes; other child nodes (such as line breaks) are skipped
         if (e instanceof TextNode) {
            lyrics.add(((TextNode) e).getWholeText());
         }
      }
      return lyrics;
   }

   public static void main(String[] args) throws IOException {
      System.out.println(LyricsGatherer.getSongLyrics("U2", "With or Without You"));
      System.out.println(LyricsGatherer.getSongLyrics("Billy Joel", "Allentown"));
      System.out.println(LyricsGatherer.getSongLyrics("Tori Amos", "Winter"));
   }
}
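
The URL construction is the part of this class most likely to need tweaking, so it may help to look at it in isolation. The following is only a minimal sketch: the class name LyricsUrlBuilder and the methods slug and buildPath are hypothetical and not part of the original source, and the base URL of the lyrics site is deliberately left out. It applies the same rule as getSongLyrics (spaces become hyphens, everything is lower-cased), with Locale.ROOT added as a small assumption so the result does not depend on the default locale of the JVM.

```java
import java.util.Locale;

public class LyricsUrlBuilder {

    // Turns "With or Without You" into "with-or-without-you": spaces become
    // hyphens and the text is lower-cased (locale-independent via Locale.ROOT).
    public static String slug(String text) {
        return text.replace(" ", "-").toLowerCase(Locale.ROOT);
    }

    // Builds the path part of the lyrics URL: /<band>/<song-title>-lyrics/
    public static String buildPath(String band, String songTitle) {
        return "/" + slug(band) + "/" + slug(songTitle) + "-lyrics/";
    }

    public static void main(String[] args) {
        System.out.println(buildPath("U2", "With or Without You"));
        System.out.println(buildPath("Billy Joel", "Allentown"));
    }
}
```

For the first call this yields /u2/with-or-without-you-lyrics/. Band names or titles containing punctuation would probably need additional handling, which this sketch does not attempt.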

The results can easily be returned in a Web Service style fashion.
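
As an illustration of what such a response could look like, here is a minimal sketch that wraps the list of lyrics lines in a small XML payload. Everything in it is an assumption made for the example: the class name LyricsXmlFormatter, the methods escape and toXml, and the element names song and line are hypothetical and do not come from the original source or from any particular web service stack.

```java
import java.util.Arrays;
import java.util.List;

public class LyricsXmlFormatter {

    // Escapes the characters that may not appear verbatim in XML text
    // or in double-quoted attribute values.
    public static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Wraps the lyrics lines in a <song> element with one <line> per entry.
    public static String toXml(String band, String songTitle, List<String> lines) {
        StringBuilder xml = new StringBuilder();
        xml.append("<song band=\"").append(escape(band))
           .append("\" title=\"").append(escape(songTitle)).append("\">");
        for (String line : lines) {
            xml.append("<line>").append(escape(line)).append("</line>");
        }
        xml.append("</song>");
        return xml.toString();
    }

    public static void main(String[] args) {
        // Placeholder lines; in practice this would be the list returned by getSongLyrics
        List<String> lyrics = Arrays.asList("First line", "Second line & more");
        System.out.println(toXml("Some Band", "Some Song", lyrics));
    }
}
```

In a real service the list returned by LyricsGatherer.getSongLyrics would be passed to toXml, and the resulting string would be handed to whatever framework exposes the operation.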


Download the source discussed in this article.