Screen scraping using Google Documents in a minute or less…

In a previous blog post, Lucas used jsoup to collect data from a web page. In this post I’ll show a declarative way to screen scrape data with the help of Google Documents.

The following webpage contains the Olympic data I would like to import.

  1. Open a new Google spreadsheet document.
  2. Paste the following formula into cell A1:
    =ImportHtml("";"table"; 3)
  3. Press enter 🙂

The ImportHtml function instructs Google Documents to retrieve the third table on the webpage. There are other import functions as well, such as ImportXml, ImportData and ImportFeed; see the Google Documents help for more information.
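
To make "the third table on the webpage" a bit more concrete, here is a minimal Python sketch of the same idea using only the standard library. It is not how Google Documents implements the function. The names TableExtractor and import_html_table are my own, the sample HTML is placeholder data, and the sketch assumes the page has no nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every <table> in a document, in order.

    Hypothetical helper for illustration; assumes tables are not nested.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table found so far
        self._row = None   # row currently being filled, if any
        self._cell = None  # text fragments of the current cell, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def import_html_table(html, index):
    """Return the rows of the index-th table, counting from 1."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables[index - 1]


# Placeholder page with three tables; the values are made up.
SAMPLE = """
<table><tr><td>first</td></tr></table>
<table><tr><td>second</td></tr></table>
<table>
  <tr><th>Country</th><th>Gold</th></tr>
  <tr><td>Country A</td><td>1</td></tr>
</table>
"""

rows = import_html_table(SAMPLE, 3)
print(rows)
```

The index counts from 1, matching the 3 in the formula above. The spreadsheet version still wins on convenience: it fetches the page for you and drops the rows straight into cells.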

Spreadsheet after screenscraping