Screen scraping using Google Documents in a minute or less…

Jorrit Nijssen

In a previous blog post, Lucas used JSoup to collect data from a web page. In this post I'll show a declarative way to screen scrape data with the help of Google Documents.

The web page http://www.databaseolympics.com/games/gamesyear.htm?g=26 contains the Olympic data I would like to import:

  1. Open a new Google spreadsheet document.
  2. Paste the following formula into cell A1:
    =ImportHtml("http://www.databaseolympics.com/games/gamesyear.htm?g=26";"table"; 3)
  3. Press Enter 🙂

The ImportHtml function instructs Google Documents to retrieve the third table on the web page and place its rows and columns in the spreadsheet, starting at the cell that holds the formula. There are other import functions as well; see http://docs.google.com/support/bin/answer.py?answer=75507 for more information.
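
For readers who want a feel for what "retrieve the third table" involves, here is a minimal Python sketch using only the standard library. It is not how Google Documents implements ImportHtml; the names `TableExtractor`, `import_html_table` and `SAMPLE_HTML` are made up for this illustration, the sample markup is placeholder data rather than the real Olympic page, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects every <table> in a document as a list of rows of cell text.

    Hypothetical helper for illustration only; nested tables are not supported.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # all tables seen so far, in document order
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def import_html_table(html, index):
    """Return the index-th table (counting from 1, as the formula above does)."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables[index - 1]


# Placeholder markup with three tables; the values are invented.
SAMPLE_HTML = (
    "<table><tr><td>menu</td></tr></table>"
    "<table><tr><td>banner</td></tr></table>"
    "<table><tr><th>Name</th><th>Count</th></tr>"
    "<tr><td>A</td><td>1</td></tr></table>"
)

if __name__ == "__main__":
    for row in import_html_table(SAMPLE_HTML, 3):
        print(row)
```

The spreadsheet formula does all of this, plus fetching the page, in a single line, which is what makes the declarative approach attractive.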

Spreadsheet after screenscraping

About Post Author

Jorrit is a senior Oracle consultant.