Querying ISO Country names and codes from SQL - exposing standard domains based on dynamic internet retrieval 20188367001

Querying ISO Country names and codes from SQL – exposing standard domains based on dynamic internet retrieval

The other day, I came across a press release – www.iso.org/iso/pressrelease.htm?refid=Ref1383 – from the ISO organization, announcing that they had issued new country codes for the islands in the Dutch West Indies to reflect their new administrative status following constitutional reform by the government of the Netherlands. It struck me that in many applications and for many services and data exchanges, we have a need for country codes. And that we may consider these codes to be constant, unchangeable over time. And that this turns out to be not true.

The ISO publishes lists of standard, publicly agreed on identifications for various things, including monetary units (currencies), country codes, language identifications, paper sizes, etc. When we have need for such code values in our services and applications, we can enlist the help of the ISO codings.

This article demonstrates how we can take the country codes published by ISO at http://www.iso.org/iso/list-en1-semic-3.txt and use it to expose a SQL View COUNTRIES in our Oracle Database. The HTTPURITYPE function is used to retrieve the country codes into our database.

When we are done setting up the country codes, we will established a view that can be queried as easily as:

The steps to get there are:

  • (as SYS) allow the database user to go out to the ISO’s web site at http://www.iso.org using UTL_HTTP
  • create a type (COUNTRY_T) and a collection type (COUNTRY_TABLE_T)
  • create a package specification (COUNTRIES) with a function get_countries that returns a COUNTRY_TABLE_T
  • create a view that returns records produced by the function get_countries in package countries
  • finally: implement the package body for package countries

Grant ‘website access privileges’ to database user

As a first, preparatory step, we will have SYS grant the privilege to user SCOTT to access the website of the ISO. The following snippet of PL/SQL code takes care of this:

begin
  dbms_network_acl_admin.create_acl
  ( acl => 'IsoCountriesList.xml',
    description => 'Countries list on www.iso.org',
    principal => 'SCOTT',
    is_grant => true,
    privilege => 'connect',
    start_date => null,
    end_date => null
  );
DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE
  ( acl => 'IsoCountriesList.xml',
    principal => 'SCOTT',
    is_grant => true,
    privilege => 'resolve'
  );
  DBMS_NETWORK_ACL_ADMIN.ASSIGN_ACL
  ( acl => 'IsoCountriesList.xml',
    host => 'www.iso.org'
  );
end;

Create the Type and Collection Type

Countries will be instantiatied using a special type – country_t – and a collection of countries is created and cached for the duration of the session using a table of country_t, through the collection type COUNTRY_TABLE_T

create type country_t
as object
( name varchar2(200)
, code varchar2(2)
)

create type countries_table_t
as table of country_t

Create a package specification countries

A package called COUNTRIES is created. This package will contain the PL/SQL code for reaching out to the ISO website to retrieve the country codes and to retain the country codes in memory for subsequent access. A function get_countries returns a COUNTRY_TABLE_T that the view can wrap using the TABLE operator.

The package specification is created as follows:

create or replace
package countries
is

function get_countries
return countries_table_t
;

end;

Create the COUNTRIES_VW view

The view COUNTRIES_VW can be accessed like any other VIEW or TABLE for that matter. It will return country records – code and name – that are produced by the function get_countries in package countries, that in turn retrieves them from the ISO website. These country records are not persisted in the database, they are produced out of thin air (‘the cloud’).

create or replace
view countries_vw
as
select code
,      name
from   table(countries.get_countries)

Implement the package body for COUNTRIES

The real work has to be done somewhere. That somewhere is the package body for COUNTRIES. During initialization – that is: the first time the package is accessed during the lifetime of a database session – the countries are read from the ISO web site, loaded into a memory and subsequently returned from that cached collection.

create or replace
package body countries
is

g_countries countries_table_t;

procedure initialize_countries
is
  l_clob clob;
  l_pos  integer;
  l_sc_pos  integer;
  l_country country_t;
  l_end_of_line varchar(2):= chr(13)||chr(10);
begin
  g_countries:= countries_table_t();

  l_clob:= httpuritype('http://www.iso.org/iso/list-en1-semic-3.txt').getClob();

  -- start processing with AFGHANISTAN
  l_pos:= instr( l_clob, 'AFGHANISTAN');

  -- find country (between l_pos and chr(13) - while l_pos < length(l_clob) there should be a semi-colon between the name and the code of a country
  loop
    l_sc_pos := instr( substr(l_clob, l_pos), ';');
    exit when l_sc_pos = 0;
    l_country := country_t
                 ( substr(l_clob, l_pos, l_sc_pos-1)
                 , substr(l_clob, l_pos + l_sc_pos , 2)
                 );
    g_countries.extend;
    g_countries( g_countries.last):= l_country;
    l_pos:= l_pos + instr( substr(l_clob, l_pos), l_end_of_line) + 1;
  end loop;

end initialize_countries;


function get_countries
return countries_table_t
is
begin
  return g_countries;
end get_countries;


begin
  initialize_countries;

end countries;

Querying the COUNTRIES_VW view

A simple query to find the country names and codes for the new islands in the Dutch West Indies would look like:

select code, name
from   countries_vw
where  code in ('SX','CW','BQ')

The result shows the three countries.

CODE         NAME
------------ ------------------------------------
BQ           BONAIRE, SAINT EUSTATIUS AND SABA
CW           CURAÇAO
SX           SINT MAARTEN (DUTCH PART)

At the same time, the country code AN – used previously for the Netherlands Antilles – has been removed from the list:

select code, name
from   countries_vw
where  code ='AN'

Note: the first query in a session from view COUNTRIES_VW can be quite slow: the results have not been cached yet in the package’s global variable g_countries. The second and subsequent calls to countries.get_countries will typically much faster (factor 10 or more) than that initial query.

Leveraging the Oracle 11g Result Cache

The countries package as implemented above does its own caching. Unfortunately, apart from the fact that we had to write code to achieve that caching functionality, all we get is per session caching. That means that every new database session will again go out to the ISO website to get hold of the country records. And that every session has its own copy of the data, which is really a waste of memory resources.

Starting with Oracle RDBMS 11gR1, we do no longer need to implement our own caching solution in many situations. Instead, we can leverage the database native feature of Function Result Cache. This feature gives us two huge benefits: it is declarative – no code required – and it will cache the data across sessions! The latter means that only one session will take the hit of retrieving country details from the ISO site and only one set of records is stored in the entire database. Also see: http://www.oracle-base.com/articles/11g/CrossSessionPlsqlFunctionResultCache_11gR1.php

Unfortunately, we are not helped with our current challenge, as for now the RESULT_CACHE feature cannot be used for functions that return an OBJECT TYPE or COLLECTION. So for now, we are stuck with the custom DIY cache.

Resources

Download sources for this article: country_querying.txt

One Response

  1. Chris Neumueller December 29, 2010