Querying ISO Country names and codes from SQL – exposing standard domains based on dynamic internet retrieval
The other day, I came across a press release – www.iso.org/iso/pressrelease.htm?refid=Ref1383 – from the ISO organization, announcing that they had issued new country codes for the islands in the Dutch West Indies to reflect their new administrative status following constitutional reform by the government of the Netherlands. It struck me that in many applications and for many services and data exchanges, we have a need for country codes. And that we may consider these codes to be constant, unchangeable over time. And that this turns out to be not true.
The ISO publishes lists of standard, publicly agreed on identifications for various things, including monetary units (currencies), country codes, language identifications, paper sizes, etc. When we have need for such code values in our services and applications, we can enlist the help of the ISO codings.
This article demonstrates how we can take the country codes published by ISO at http://www.iso.org/iso/list-en1-semic-3.txt and use it to expose a SQL View COUNTRIES in our Oracle Database. The HTTPURITYPE function is used to retrieve the country codes into our database.
When we are done setting up the country codes, we will established a view that can be queried as easily as:
The steps to get there are:
- (as SYS) allow the database user to go out to the ISO’s web site at http://www.iso.org using UTL_HTTP
- create a type (COUNTRY_T) and a collection type (COUNTRY_TABLE_T)
- create a package specification (COUNTRIES) with a function get_countries that returns a COUNTRY_TABLE_T
- create a view that returns records produced by the function get_countries in package countries
- finally: implement the package body for package countries
Grant ‘website access privileges’ to database user
As a first, preparatory step, we will have SYS grant the privilege to user SCOTT to access the website of the ISO. The following snippet of PL/SQL code takes care of this:
begin dbms_network_acl_admin.create_acl ( acl => 'IsoCountriesList.xml', description => 'Countries list on www.iso.org', principal => 'SCOTT', is_grant => true, privilege => 'connect', start_date => null, end_date => null ); DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE ( acl => 'IsoCountriesList.xml', principal => 'SCOTT', is_grant => true, privilege => 'resolve' ); DBMS_NETWORK_ACL_ADMIN.ASSIGN_ACL ( acl => 'IsoCountriesList.xml', host => 'www.iso.org' ); end;
Create the Type and Collection Type
Countries will be instantiatied using a special type – country_t – and a collection of countries is created and cached for the duration of the session using a table of country_t, through the collection type COUNTRY_TABLE_T
create type country_t as object ( name varchar2(200) , code varchar2(2) ) create type countries_table_t as table of country_t
Create a package specification countries
A package called COUNTRIES is created. This package will contain the PL/SQL code for reaching out to the ISO website to retrieve the country codes and to retain the country codes in memory for subsequent access. A function get_countries returns a COUNTRY_TABLE_T that the view can wrap using the TABLE operator.
The package specification is created as follows:
create or replace package countries is function get_countries return countries_table_t ; end;
Create the COUNTRIES_VW view
The view COUNTRIES_VW can be accessed like any other VIEW or TABLE for that matter. It will return country records – code and name – that are produced by the function get_countries in package countries, that in turn retrieves them from the ISO website. These country records are not persisted in the database, they are produced out of thin air (‘the cloud’).
create or replace view countries_vw as select code , name from table(countries.get_countries)
Implement the package body for COUNTRIES
The real work has to be done somewhere. That somewhere is the package body for COUNTRIES. During initialization – that is: the first time the package is accessed during the lifetime of a database session – the countries are read from the ISO web site, loaded into a memory and subsequently returned from that cached collection.
create or replace package body countries is g_countries countries_table_t; procedure initialize_countries is l_clob clob; l_pos integer; l_sc_pos integer; l_country country_t; l_end_of_line varchar(2):= chr(13)||chr(10); begin g_countries:= countries_table_t(); l_clob:= httpuritype('http://www.iso.org/iso/list-en1-semic-3.txt').getClob(); -- start processing with AFGHANISTAN l_pos:= instr( l_clob, 'AFGHANISTAN'); -- find country (between l_pos and chr(13) - while l_pos < length(l_clob) there should be a semi-colon between the name and the code of a country loop l_sc_pos := instr( substr(l_clob, l_pos), ';'); exit when l_sc_pos = 0; l_country := country_t ( substr(l_clob, l_pos, l_sc_pos-1) , substr(l_clob, l_pos + l_sc_pos , 2) ); g_countries.extend; g_countries( g_countries.last):= l_country; l_pos:= l_pos + instr( substr(l_clob, l_pos), l_end_of_line) + 1; end loop; end initialize_countries; function get_countries return countries_table_t is begin return g_countries; end get_countries; begin initialize_countries; end countries;
Querying the COUNTRIES_VW view
A simple query to find the country names and codes for the new islands in the Dutch West Indies would look like:
select code, name from countries_vw where code in ('SX','CW','BQ')
The result shows the three countries.
CODE NAME ------------ ------------------------------------ BQ BONAIRE, SAINT EUSTATIUS AND SABA CW CURAÃ‡AO SX SINT MAARTEN (DUTCH PART)
At the same time, the country code AN – used previously for the Netherlands Antilles – has been removed from the list:
select code, name from countries_vw where code ='AN'
Note: the first query in a session from view COUNTRIES_VW can be quite slow: the results have not been cached yet in the package’s global variable g_countries. The second and subsequent calls to countries.get_countries will typically much faster (factor 10 or more) than that initial query.
Leveraging the Oracle 11g Result Cache
The countries package as implemented above does its own caching. Unfortunately, apart from the fact that we had to write code to achieve that caching functionality, all we get is per session caching. That means that every new database session will again go out to the ISO website to get hold of the country records. And that every session has its own copy of the data, which is really a waste of memory resources.
Starting with Oracle RDBMS 11gR1, we do no longer need to implement our own caching solution in many situations. Instead, we can leverage the database native feature of Function Result Cache. This feature gives us two huge benefits: it is declarative – no code required – and it will cache the data across sessions! The latter means that only one session will take the hit of retrieving country details from the ISO site and only one set of records is stored in the entire database. Also see: http://www.oracle-base.com/articles/11g/CrossSessionPlsqlFunctionResultCache_11gR1.php
Unfortunately, we are not helped with our current challenge, as for now the RESULT_CACHE feature cannot be used for functions that return an OBJECT TYPE or COLLECTION. So for now, we are stuck with the custom DIY cache.
Download sources for this article: country_querying.txt
- Poor man's VPD – Virtual Private Database before 8i and in Standard Edition Databases
- PL/SQL post compiler to fix TAPI – dealing with the 'TAPI cannot deal with MERGE' problem
- Dynamic SQL Pivoting – Stealing Anton’s Thunder
- The SQL Query to power Google Suggest
- SELECT * FROM RSS_FEED – querying feeds in SQL using Table Functions and XML parsing