A quite common concept in many programming langiages, though perhaps strongest in Java, is the String Tokenizer. This is a utility class or function that takes a String as input, and optionally one or more delimiters, and breaks up the string in individual elements, returning those as an appropriate collection. As an example: a String Tokenizer could split the string “red,blue,yellow” into a collection of the strings red, blue and yellow – if the comma had been specified as delimiter or no delimiter had been defined and the comma was the default. With the “e” as delimiter, the String Tokenizer would have produced: “r”, “d,blu”,”,y” and “llow”.
PL/SQL has something akin to the String Tokenizer, in the DBMS_UTILITY.COMMA_TO_TABLE function. By the way, my colleague Alex discussed this function in quite a bit of detail in his recent article Never Assume, Always Test.
While preparing for our session “SQL Puzzles” (next Tuesday, 13th February) I needed a String Tokenizer in SQL, ideally nothing but SQL. Again, with a different objective in mind, I wrote a SQL Query that indeed acts as a Tokenizer. I am quite sure it can be done much better – Anton: please share with me your thoughts on this – but I will share my attempt with you anyway.
The query is this one:
with numbers as ( select rownum rn from ( select count(*) rn from dual group by cube(1,2,3,4) ) ) , input as ( select 'red,blue,yellow' string_to_tokenize , 'e' delimiter from dual ) , positions as ( select rn pos from numbers join input on instr(string_to_tokenize, delimiter, 1, rn) > 0 union select 0 pos from input where instr(string_to_tokenize, delimiter) > 0 ) select substr( string_to_tokenize , 1 + case pos when 0 then 0 else instr(string_to_tokenize, delimiter, 1, pos) end , case instr(string_to_tokenize, delimiter, 1, pos+1) when 0 then length(string_to_tokenize)+2 else instr(string_to_tokenize, delimiter, 1, pos+1)+1 end - ( 2 + case pos when 0 then 0 else instr(string_to_tokenize, delimiter, 1, pos) end)) token from positions , input
When I run the query on , the results are:
TOKEN ------------------ red blue yellow
And when I use “e” as the delimiter, the results are:
TOKEN ------------- r d,blu ,y llow
Note that the inline-view numbers is somewhat limited as right now it will return no more than 16 rows. There are of course several ways to increase or tune that number, one would be to have numbers adapt to the inline-view input, that should probably be defined first in the query.
Also note that this query could be run using SQL*Plus parameters, like this:
SQL> with numbers as 2 ( select rownum rn 3 from ( select count(*) rn 4 from dual 5 group 6 by cube(1,2,3,4) 7 ) 8 ) 9 , input as 10 ( select '&String_To_Tokenizer' string_to_tokenize 11 , '&Delimiter' delimiter 12 from dual 13 ) 14 , positions as 15 ( select rn pos 16 from numbers 17 join 18 input 19 on instr(string_to_tokenize, delimiter, 1, rn) > 0 20 union 21 select 0 pos 22 from input 23 where instr(string_to_tokenize, delimiter) > 0 24 ) 25 select substr( string_to_tokenize 26 , 1 + case pos when 0 then 0 else instr(string_to_tokenize, delimiter, 1, pos) end 27 , case instr(string_to_tokenize, delimiter, 1, pos+1) when 0 then length(string_to_toke nize)+2 else instr(string_to_tokenize, delimiter, 1, pos+1)+1 end 28 - ( 2 + case pos when 0 then 0 else instr(string_to_tokenize, delimiter, 1, pos) end) ) token 29 from positions 30 , input 31 / Enter value for string_to_tokenizer: in the land of the pigs, the butcher is king (by Meat Loaf) old 10: ( select '&String_To_Tokenizer' string_to_tokenize new 10: ( select 'in the land of the pigs, the butcher is king (by Meat Loaf)' string_to_tokenize Enter value for delimiter: old 11: , '&Delimiter' delimiter new 11: , ' ' delimiter TOKEN -------------------------------------------------------------------------------- in the land of the pigs, the butcher is king (by TOKEN -------------------------------------------------------------------------------- Meat Loaf) 13 rows selected.
Finally, the query should be extended to support multiple delimiters.
SELECT SUBSTR(X,Y + 1, Z – 1 – Y)
FROM (
SELECT X, INSTR(X,DELIMITER,1,ROWNUM) Y, INSTR(X,DELIMITER,1,ROWNUM+1) Z
FROM (SELECT ‘,’ || ‘A,BB,CC’ || ‘,’ X, ‘,’ DELIMITER FROM DUAL), ALL_TABLES
WHERE INSTR(X,DELIMITER,1,ROWNUM+1) != 0
)
hi guyz i tried to develop a number puzzle in forms6i got strucked in middle .. can any one help me out ..
declare
temp varchar2(40);
lbl varchar2(40);
begin
lbl:=get_item_property(name_in(‘system.trigger_item’),label);
temp:= :system.trigger_item;
if get_item_property(temp,VISIBLE)= ‘TRUE’ then
set_item_property(:parameter.HIDENUM,VISIBLE,PROPERTY_TRUE);
set_item_property(:parameter.HIDENUM,ENABLED,PROPERTY_TRUE);
set_item_property(:parameter.HIDENUM,label,lbl);
go_item(:parameter.HIDENUM);
set_item_property(temp,VISIBLE,PROPERTY_FALSE);
:parameter.HIDENUM:=temp;
end if;
/*if get_item_property(‘number_game.TWO’,displayed)=’TRUE’ then
set_item_property(‘number_game.TWO’,displayed,’FALSE’);
end if;
if get_item_property(‘number_game.THREE’,displayed)=’TRUE’ then
set_item_property(‘number_game.THREE’,displayed,’FALSE’);
end if;*/
if lbl =’EXIT’ then exit_form(no_validate);
end if;
end;
Hi,
I found and already used a nice solution from O’Reillys "SQL Cookbook".
Those days I needed to convert delimited data into a multi-valued IN-List.
I just changed the query a bit to work as a string tokenizer. See:
SELECT RTRIM(
SUBSTR( x.token
, INSTR(x.token, ‘&Delimiter’, 1, iter.pos) + 1
,INSTR(x.token, ‘&Delimiter’, 1, iter.pos + 1) – INSTR(x.token, ‘&Delimiter’, 1, iter.pos)
)
, ‘&Delimiter’
) token
, iter.pos AS Position
FROM (SELECT ‘&Delimiter’ || ‘&StringToTokenize’ || ‘&Delimiter’ token
FROM dual
) x
, (SELECT rownum AS pos
FROM all_objects
) iter
WHERE iter.pos
(Sorry,the PRE tags got stripped)
My own variant using the MODEL clause
SELECT var_list, var_list2
FROM ( select ‘red,blue,yellow’ var_list,’,’ delim from dual )
MODEL
PARTITION BY (var_list, delim)
DIMENSION BY (0 AS z)
MEASURES (var_list var_list2,
0 vcnt, 0 as vstart, 0 as vend)
RULES ITERATE (4000) UNTIL (ITERATION_NUMBER+1 = vcnt[0] )
(vcnt[ITERATION_NUMBER]=
nvl(length(cv(var_list)) – length(replace(cv(var_list),cv(delim))) + 1,1),
vstart[ITERATION_NUMBER] =
case when ITERATION_NUMBER = 0 then 1
else instr(cv(var_list),cv(delim),1,ITERATION_NUMBER)+1 end,
vend[ITERATION_NUMBER] =
case when nvl(instr(cv(var_list),cv(delim),1,ITERATION_NUMBER+1),0) = 0
then nvl(length(cv(var_list))+1,1)
else instr(cv(var_list),cv(delim),1,ITERATION_NUMBER+1) end,
var_list2[ITERATION_NUMBER] =
substr(cv(var_list),vstart[ITERATION_NUMBER],
vend[ITERATION_NUMBER] – vstart[ITERATION_NUMBER])
)
order by var_list, vstart;
Gentlemen, you have started your engines! Two beatiful solutions, that surpass my original proposal. Thanks for sharing your ideas!
Lucas
My thoughts about this (only because you asked for it 🙂 ):
For the thoughts of some other people on something simulare, see Adrian’s Oracle Page at http://mysite.wanadoo-members.co.uk/adriansoraclepages/demos/10g/scripts/string_to_table_sql_only.sql
Anton
BTW, is there a way to format code in a comment? Because all the line brakes are gone. 🙁
Hi Lucas,
you are keeping me away from work! But a SQL puzzle in the morning is good for my brain! 🙂
Here is my alternate solution using CONNECT BY
Patrick
Check out my APEX blog: http://inside-apex.blogspot.com/