Oracle Stored Procedure Read Parse Delimited File
Hiya,
Since name/value pairs are very similar to JSON (but here using different delimiters), I got inspired by Stew Ashton using JSON_TABLE to split strings: https://stewashton.wordpress.com/2018/06/05/splitting-strings-a-new-champion/
I'm building a test table with 10 rows, each with a CLOB containing 90 (3*30) name/value pairs.
Then I use some REPLACE to turn the name=value~ format into JSON.
Then it can be parsed with JSON_TABLE.
create table t ( id integer, c clob );

declare
   v varchar2(32000);
   ctemp clob;
begin
   dbms_lob.createtemporary(ctemp, true);
   for j in 1 .. 10 loop
      dbms_lob.trim(ctemp, 0);
      for i in 1 .. 30 loop
         v := 'pid'||i||'='||(j*10000000+i)||'~'
           || 'username'||i||'='||rpad(j||i||'abc', j*10+10*i, 'xyz')||'~'
           || 'attr'||i||'='||rpad(j||i||'xyz', j*5+5*i, 'abc')||'~';
         dbms_lob.writeappend(ctemp, length(v), v);
      end loop;
      insert into t values (j, ctemp);
   end loop;
   dbms_lob.freetemporary(ctemp);
   commit;
end;
/

select id, dbms_lob.getlength(c) len from t;

        ID        LEN
---------- ----------
         1       8448
         2       8898
         3       9348
         4       9798
         5      10248
         6      10698
         7      11148
         8      11598
         9      12048
        10      12528

10 rows selected.

set linesize 132
set pagesize 50
column id format 99
column j_obj format a100
set long 100

-- Turn into JSON
select id
     , '{"' || replace(replace(replace(rtrim(c,'~'), '"', '\"'), '=', '":"'), '~', '","') || '"}' j_obj
  from t;

 ID J_OBJ
--- ----------------------------------------------------------------------------------------------------
  1 {"pid1":"10000001","username1":"11abcxyzxyzxyzxyzxyz","attr1":"11xyzabcab","pid2":"10000002","userna
  2 {"pid1":"20000001","username1":"21abcxyzxyzxyzxyzxyzxyzxyzxyzx","attr1":"21xyzabcabcabca","pid2":"20
  3 {"pid1":"30000001","username1":"31abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxy","attr1":"31xyzabcabcabcabc
  4 {"pid1":"40000001","username1":"41abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz","attr1":"41xyzab
  5 {"pid1":"50000001","username1":"51abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzx","attr1
  6 {"pid1":"60000001","username1":"61abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz
  7 {"pid1":"70000001","username1":"71abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz
  8 {"pid1":"80000001","username1":"81abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz
  9 {"pid1":"90000001","username1":"91abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz
 10 {"pid1":"100000001","username1":"101abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzx

10 rows selected.
column username1 format a25
column attr1 format a25
column username30 format a25
column attr30 format a25

-- Use in JSON_TABLE (substr just for SQL*Plus output)
select t.id
     , jt.pid1
     , substr(jt.username1 , 1, 25) as username1
     , substr(jt.attr1     , 1, 25) as attr1
       /* etc etc etc */
     , jt.pid30
     , substr(jt.username30, 1, 25) as username30
     , substr(jt.attr30    , 1, 25) as attr30
  from t
     , json_table(
          '{"' || replace(replace(replace(rtrim(c,'~'), '"', '\"'), '=', '":"'), '~', '","') || '"}'
        , '$'
          columns (
             pid1       number         path '$.pid1'
           , username1  varchar2(4000) path '$.username1'
           , attr1      varchar2(4000) path '$.attr1'
             /* etc etc etc */
           , pid30      number         path '$.pid30'
           , username30 varchar2(4000) path '$.username30'
           , attr30     varchar2(4000) path '$.attr30'
          )
       ) jt
;

 ID       PID1 USERNAME1                 ATTR1                          PID30 USERNAME30                ATTR30
--- ---------- ------------------------- ------------------------- ---------- ------------------------- -------------------------
  1   10000001 11abcxyzxyzxyzxyzxyz      11xyzabcab                  10000030 130abcxyzxyzxyzxyzxyzxyzx 130xyzabcabcabcabcabcabca
  2   20000001 21abcxyzxyzxyzxyzxyzxyzxy 21xyzabcabcabca             20000030 230abcxyzxyzxyzxyzxyzxyzx 230xyzabcabcabcabcabcabca
  3   30000001 31abcxyzxyzxyzxyzxyzxyzxy 31xyzabcabcabcabcabc        30000030 330abcxyzxyzxyzxyzxyzxyzx 330xyzabcabcabcabcabcabca
  4   40000001 41abcxyzxyzxyzxyzxyzxyzxy 41xyzabcabcabcabcabcabcab   40000030 430abcxyzxyzxyzxyzxyzxyzx 430xyzabcabcabcabcabcabca
  5   50000001 51abcxyzxyzxyzxyzxyzxyzxy 51xyzabcabcabcabcabcabcab   50000030 530abcxyzxyzxyzxyzxyzxyzx 530xyzabcabcabcabcabcabca
  6   60000001 61abcxyzxyzxyzxyzxyzxyzxy 61xyzabcabcabcabcabcabcab   60000030 630abcxyzxyzxyzxyzxyzxyzx 630xyzabcabcabcabcabcabca
  7   70000001 71abcxyzxyzxyzxyzxyzxyzxy 71xyzabcabcabcabcabcabcab   70000030 730abcxyzxyzxyzxyzxyzxyzx 730xyzabcabcabcabcabcabca
  8   80000001 81abcxyzxyzxyzxyzxyzxyzxy 81xyzabcabcabcabcabcabcab   80000030 830abcxyzxyzxyzxyzxyzxyzx 830xyzabcabcabcabcabcabca
  9   90000001 91abcxyzxyzxyzxyzxyzxyzxy 91xyzabcabcabcabcabcabcab   90000030 930abcxyzxyzxyzxyzxyzxyzx 930xyzabcabcabcabcabcabca
 10  100000001 101abcxyzxyzxyzxyzxyzxyzx 101xyzabcabcabcabcabcabca  100000030 1030abcxyzxyzxyzxyzxyzxyz 1030xyzabcabcabcabcabcabc

10 rows selected.
Hope that's useful ;-)
Cheers
/Kim
nice stuff.
Thanks to Kim for the reference!
Marc Bleron blogged about CSV CLOBs and JSON_TABLE two years ago:
https://odieweblog.wordpress.com/2016/04/22/simple-csv-parsing-using-xmltable-or-json_table/
but he used JSON arrays, not objects.
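For reference, the array flavor goes roughly like this (my sketch of the idea, not Marc's exact code): wrap the delimited line in brackets and quotes so it becomes a JSON array, then let JSON_TABLE pick the values out by position instead of by key.

select jt.*
  from json_table(
          '["' || replace('7369,SMITH,CLERK', ',', '","') || '"]',
          '$'
          columns (
             c1 varchar2(10) path '$[0]',
             c2 varchar2(10) path '$[1]',
             c3 varchar2(10) path '$[2]'
          )
       ) jt;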
The OP said "up to 60 attributes", so I suppose some attributes may be missing in some records. I thought I'd emulate that, and also put the attributes in random order, just to complement what Kim has already shown. To make things easier for me, I'm going to add newlines after each record.
First, I have a question: I have read that to load a large CLOB, it's better to use a lob locator for a persistent LOB. Is this code OK? (See the two lines after BEGIN)
create table t(c clob);

declare
   l_clob clob;
begin
   insert into t values(empty_clob()) returning c into l_clob;
   dbms_lob.open(l_clob, dbms_lob.lob_readwrite);
   for rec in (
      with data as (
         select level+1 n, 'a'||(level+1) || '=' || 'v\'||(level+1) || '~' str
         from dual
         connect by level <= 9
      ), randoms as (
         select level+1 n, floor(dbms_random.value(2,9.9999)) r
         from dual
         connect by level <= 100
      )
      select 'pid='||(r.n-1)||'~'|| listagg(str) within group(order by dbms_random.random) || chr(10) str
      from data d, randoms r
      where d.n != r.r
      group by r.n
   ) loop
      dbms_lob.writeappend(l_clob, length(rec.str), rec.str);
   end loop;
   dbms_lob.close(l_clob);
   commit;
exception
   when others then
      if dbms_lob.isopen(l_clob) = 1 then
         dbms_lob.close(l_clob);
      end if;
      raise;
end;
/

set long 200
select substr(c,1,200) from t;

SUBSTR(C,1,200)
--------------------------------------------------------------------------------
pid=1~a5=v\5~a2=v\2~a3=v\3~a10=v\10~a7=v\7~a8=v\8~a6=v\6~a4=v\4~a9=v\9~
pid=2~a6=v\6~a4=v\4~a8=v\8~a3=v\3~a5=v\5~a2=v\2~a10=v\10~
pid=3~a4=v\4~a7=v\7~a10=v\10~a8=v\8~a5=v\5~a9=v\9~a2=v\2~
pid=4~a4=v\4
Now, here is a pipelined table function that will break the CLOB down into VARCHAR2-size chunks for processing. I think this will scale much better for big volumes.
create or replace function pipe_clob (
   p_clob        in clob,
   p_max_lengthb in integer  default 4000,
   p_rec_term    in varchar2 default chr(10)  -- record terminator; chr(10) matches the newline-terminated data above
) return sys.odcivarchar2list pipelined authid current_user as
   -- Reads the CLOB in pieces of at most p_max_lengthb bytes, backing up
   -- to the last record terminator in each piece so that no record is
   -- ever split across two chunks.
   l_amount          integer;
   l_offset          integer;
   l_buffer          varchar2(4000 char);
   l_out             varchar2(4000 byte);
   l_buff_lengthb    integer;
   l_occurence       integer;
   l_rec_term_length integer := length(p_rec_term);
begin
   l_amount := p_max_lengthb;
   l_offset := 1;
   while l_amount = p_max_lengthb loop
      begin
         dbms_lob.read(p_clob, l_amount, l_offset, l_buffer);
      exception
         when no_data_found then
            l_amount := 0;
      end;
      if l_amount > 0 then
         -- find the last occurrence of the record terminator that
         -- still fits within p_max_lengthb bytes
         l_buff_lengthb := p_max_lengthb + 1;
         l_occurence := 0;
         while l_buff_lengthb > p_max_lengthb loop
            l_occurence := l_occurence + 1;
            l_buff_lengthb := instrb(l_buffer, p_rec_term, -1, l_occurence);
         end loop;
         if l_buff_lengthb = 0 then
            l_buff_lengthb := lengthb(l_buffer);
         end if;
         l_out := substrb(l_buffer, 1, l_buff_lengthb - l_rec_term_length);
         pipe row(l_out);
         l_offset := l_offset + length(l_out) + l_rec_term_length;
      end if;
   end loop;
   return;
end;
/
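Before any JSON juggling, you can eyeball what the function returns on its own; something like this (my quick sanity check, reusing the 3300-byte limit from the query below) should show one row per chunk, each chunk ending at a record boundary:

select rownum as chunk#, lengthb(column_value) as bytes
  from pipe_clob((select c from t), 3300);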
After that:
- I call JSON_ARRAY on each chunk to escape characters if necessary (see the small demo after this list).
- Then I turn each record into an object, but put all the objects inside an array.
- JSON_TABLE breaks down the array and puts each value in the right column.
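A small demo of that escaping step (my own example, not from the original thread): JSON_ARRAY turns backslashes, double quotes and control characters into valid JSON escapes, so the newline I appended after each record becomes the literal two-character sequence \n - which is exactly what the first REPLACE below targets.

select json_array('a2=v\2~' || chr(10)) j from dual;

J
--------------------
["a2=v\\2~\n"]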
with json_data as (
   select replace(
             replace(
                replace(
                   '[{' || substr(json_array(column_value), 2, length(json_array(column_value))-4) || '"}]',
                   '~\n',   -- json_array has escaped each real newline into the two characters \n
                   '"},{"'
                ),
                '=',
                '":"'
             ),
             '~',
             '","'
          ) jstr
     from pipe_clob((select c from t), 3300)
)
select r.*
  from json_data j,
       json_table(
          j.jstr, '$[*]'
          columns (
             pid varchar2(9) path '$.pid',
             a2  varchar2(9) path '$.a2',
             a3  varchar2(9) path '$.a3',
             a4  varchar2(9) path '$.a4',
             a5  varchar2(9) path '$.a5',
             a6  varchar2(9) path '$.a6',
             a7  varchar2(9) path '$.a7',
             a8  varchar2(9) path '$.a8',
             a9  varchar2(9) path '$.a9'
          )
       ) r
 where rownum <= 10;

PID       A2        A3        A4        A5        A6        A7        A8        A9
--------- --------- --------- --------- --------- --------- --------- --------- ---------
1         v\2       v\3       v\4       v\5       v\6       v\7       v\8       v\9
2         v\2       v\3       v\4       v\5       v\6                 v\8
3         v\2                 v\4       v\5                 v\7       v\8       v\9
4         v\2       v\3       v\4       v\5       v\6       v\7       v\8       v\9
5         v\2       v\3       v\4       v\5       v\6       v\7       v\8       v\9
6         v\2                 v\4       v\5       v\6       v\7       v\8       v\9
7         v\2       v\3       v\4       v\5       v\6       v\7                 v\9
8         v\2       v\3       v\4       v\5       v\6       v\7       v\8
9         v\2       v\3       v\4       v\5       v\6       v\7       v\8       v\9
10        v\2       v\3       v\4       v\5       v\6       v\7       v\8       v\9
Best regards,
Stew
I have read that to load a large CLOB, it's better to use a lob locator for a persistent LOB
Do you have a reference?
You ask: "Do you have a reference?"
I thought I had read something two days previously. I find nothing that fits my (imaginary?) memory, but I did find this:
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/adlob/working-with-LOBs.html#GUID-F61CA075-8CB1-4E0B-B538-D886F6CAE263
"Before you can start writing data to a persistent LOB using ... PL/SQL ... , you must make the LOB cavalcade/attribute non-NULL.
You can brand a LOB column/attribute non-Aught by initializing the persistent LOB to empty, using an INSERT/UPDATE argument with the function ... EMPTY_CLOB for CLOBs and NCLOBs."
Wouldn't that be more efficient than populating a temporary LOB in a temp tablespace, then copying that LOB to a normal tablespace?
Anyway, I wanted your opinion and now I have it, thanks.
Best regards,
Stew
Different use cases I think here - my demo just wanted a nice big clob to work with as source. It is/was a transient thing, so I think a temporary clob makes sense.
Similarly, if I'm writing to a clob as part of an application but ultimately I'll be discarding it, I'll go temporary.
But if I'm building a clob that will persist forever, then I'll probably go permanent.
Why the caveat of "probably"? Because I might be doing 10,000 operations on the clob before I reach my "final state", all of which would be logged, have read consistent chunk memory etc. So in that case, I might go temporary for the 9,999 operations and transfer to permanent for that final operation to make it persist.
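To make that concrete, here is a minimal sketch of the "work in temp, persist once" pattern (my illustration, not Connor's code; it reuses the t(c clob) table from above): do the bulk of the writes against a temporary LOB, then persist the finished result with a single copy.

declare
   l_tmp  clob;
   l_perm clob;
begin
   -- the many intermediate writes hit the temp tablespace,
   -- not a permanent segment
   dbms_lob.createtemporary(l_tmp, true);
   for i in 1 .. 9999 loop
      dbms_lob.writeappend(l_tmp, 5, 'data~');
   end loop;
   -- one final operation to make the result persist
   insert into t values (empty_clob()) returning c into l_perm;
   dbms_lob.copy(l_perm, l_tmp, dbms_lob.getlength(l_tmp));
   dbms_lob.freetemporary(l_tmp);
   commit;
end;
/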
Thanks for the response. I tested this against my tables and it works very well!
glad we could help
Connor said: "Different use cases I think here - my demo just wanted a nice big clob to work with as source. It is/was a transient thing, so I think a temporary clob makes sense."
Yes, clearly. I was asking about a different use case and should have stated that more explicitly.
"...I might be doing 10,000 operations on the clob before I accomplish my "terminal state", all of which would logged, have read consistent chunk retention etc. Then in that instance, I might go temporary for the nine,999 operations and transfer to permanent for that final operation to make it persist."
Good point! Thanks for the in-depth answer, much appreciated.
Best regards,
Stew
Source: https://asktom.oracle.com/pls/apex/asktom.search?tag=parsing-a-clob-field-with-csv-data-and-put-the-contents-into-it-proper-fields