Posts tagged screen scraping
Last year I wrote about JSoup, a Java library that helps with screenscraping: Screenscraping from Java using jsoup – effective data gathering from websites (http://technology.amis.nl/blog/13121/screenscraping-from-java-using-jsoup-effective-data-gathering-from-websites). Last month I had another opportunity to use JSoup, this time to gather the lyrics for the songs on a CD. The context was the internal SOA for Java Professionals training program at AMIS, where the students completed an assignment to conclude the second block of this three-part program. The assignment required them to implement a Web Service that produces the CD booklet for a given CD, returned as a PDF document with an illustration, song titles and song lyrics. One of the resources we made available to the students was a Java class that returns song lyrics. Their challenge was to integrate this class properly into their application (be it PL/SQL, SOA Suite 11g or OSB based).
The LyricsGatherer is easily constructed using JSoup and the website http://www.songlyrics.com/ (which unfortunately suffers from periodic loss of service):