An Oracle10g PL/SQL implementation of the AWK split() built-in

0

With the regular expression functionality in Orcale database 10g some handy build-ins of the UNIX pattern matching language AWK are easily implemented. I myself use AWK split() function quite often in AWK scripts. AWK split() splits a given string into chunks separated by substrings which are described by a regular expression parameter. The substrings are returned in hash table (an associative array like datatype). The function returns the number of chunks collected.

A typical split call could be something like  t = split(‘oracle123@mysql’,tab,’^a-z +’) putting  ‘oracle’ and ‘mysql’ in the hash table tab and returning t = 2.....

The following piece of code is for my own little recursive PL/SQL split procedure with some added functionality. The first 3 parameters do what their AWK counterparts do. The parameter "mode" if set to < 0 makes split collect substrings described by regexp into tab rather than filtering out undesirable ones. "ml" tells split the length a chunk should have in order te be potentially interesting.

I wrote jsplit() (my version of split as a procedure, instead of as a function, mainly because the number of chunks returned can easily be queried using the ".count" method of the tab variable if necessary.

create or replace type chartab_t as table of varchar2(4000);

The procedure itself:

CREATE OR REPLACE<br />procedure jsplit<br />(str IN varchar2,tab IN OUT chartab_t,fs varchar2,md IN number default 1, ml IN number default 2)&nbsp; is<br />&nbsp; t&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; number;<br />&nbsp; v_sub&nbsp;&nbsp; varchar2(4000);<br />&nbsp; v_str&nbsp;&nbsp; varchar2(4000) := str;<br />begin<br />&nbsp; if str is null then<br />&nbsp;&nbsp;&nbsp; return;<br />&nbsp; end if;<br />&nbsp; t := regexp_instr(v_str,fs);<br />&nbsp; if md &gt; 0 THEN –-&nbsp; AWK split(); t == first index fieldsep; <br />&nbsp;&nbsp;&nbsp; case t<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; when 0 then — no fieldsep<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (length(v_str) &gt;= ml) then<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tab.extend;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tab(tab.last) := v_str;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; v_str := ‘’;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; jsplit(v_str,tab,fs,md,ml);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; v_str := ‘’;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; jsplit(v_str,tab,fs,md,ml);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; end if;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; when 1 then –<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; v_str := regexp_replace(v_str,fs,'’,1,1,’i');<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; jsplit(v_str,tab,fs,md,ml);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else — t &gt; 1<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; v_sub := substr(v_str,1,t-1);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (length(v_sub) &gt;= ml) then<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tab.extend;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tab(tab.last) := v_sub;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; end if;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; v_str := substr(v_str,t);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; jsplit(v_str,tab,fs,md,ml);<br />&nbsp;&nbsp;&nbsp; end case;<br />&nbsp; else –&nbsp; mode &lt; 0<br />&nbsp;&nbsp;&nbsp; case t<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; when 0 then<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; when 1 then<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; v_sub := regexp_substr(v_str,fs);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if length(v_sub) &gt;= ml then<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tab.extend;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tab(tab.last) := v_sub;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; end if;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; v_str := regexp_replace(v_str,fs,'’,1,1,’i');<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; jsplit(v_str,tab,fs,md,ml);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else — t &gt; 1<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; v_str := substr(v_str,t);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; jsplit(v_str,tab,fs,md,ml);<br />&nbsp;&nbsp;&nbsp; end case;<br />&nbsp; end if;<br />end;
&nbsp;
Share.

About Author

Comments are closed.