Oracle Stored Procedure Read Parse Delimited File

Alternative inspired by Stew Ashton's JSON_TABLE usage

Kim Berg Hansen, June 06, 2018 - 10:12 am UTC

Hiya,

Since name/value pairs are very similar to JSON (but here using different delimiters), I got inspired by Stew Ashton using JSON_TABLE to split strings: https://stewashton.wordpress.com/2018/06/05/splitting-strings-a-new-champion/

I'm building a test table with 10 rows, each with a CLOB containing 90 (3*30) name/value pairs.
Then I use some REPLACE calls to turn the name=value~ format into JSON.
And then it can be parsed with JSON_TABLE.

create table t ( id integer, c clob );

declare
  v varchar2(32000);
  ctemp clob;
begin
  dbms_lob.createtemporary(ctemp, true);
  for j in 1 .. 10 loop
     dbms_lob.trim(ctemp, 0);
     for i in 1 .. 30 loop
       v := 'pid'||i||'='||(j*10000000+i)||'~'
         || 'username'||i||'='||rpad(j||i||'abc', j*10+10*i, 'xyz')||'~'
         || 'attr'||i||'='||rpad(j||i||'xyz', j*5+5*i, 'abc')||'~';
       dbms_lob.writeappend(ctemp, length(v), v);
     end loop;
     insert into t values (j, ctemp);
  end loop;
  dbms_lob.freetemporary(ctemp);
  commit;
end;
/

select id, dbms_lob.getlength(c) len from t;

        ID        LEN
---------- ----------
         1       8448
         2       8898
         3       9348
         4       9798
         5      10248
         6      10698
         7      11148
         8      11598
         9      12048
        10      12528

10 rows selected.

set linesize 132
set pagesize 50
column id format 99
column j_obj format a100
set long 100

-- Turn into JSON

select id
     , '{"' || replace(replace(replace(rtrim(c,'~'), '"', '\"'), '=', '":"'), '~', '","') || '"}' j_obj
  from t;

 ID J_OBJ
--- ----------------------------------------------------------------------------------------------------
  1 {"pid1":"10000001","username1":"11abcxyzxyzxyzxyzxyz","attr1":"11xyzabcab","pid2":"10000002","userna
  2 {"pid1":"20000001","username1":"21abcxyzxyzxyzxyzxyzxyzxyzxyzx","attr1":"21xyzabcabcabca","pid2":"20
  3 {"pid1":"30000001","username1":"31abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxy","attr1":"31xyzabcabcabcabc
  4 {"pid1":"40000001","username1":"41abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz","attr1":"41xyzab
  5 {"pid1":"50000001","username1":"51abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzx","attr1
  6 {"pid1":"60000001","username1":"61abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz
  7 {"pid1":"70000001","username1":"71abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz
  8 {"pid1":"80000001","username1":"81abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz
  9 {"pid1":"90000001","username1":"91abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz
 10 {"pid1":"100000001","username1":"101abcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzx

10 rows selected.
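To see the REPLACE chain in isolation, here is a minimal sketch (the two-pair literal is made up just for illustration) of how one name=value~ string becomes a JSON object:

select '{"' || replace(replace(replace(rtrim('a=1~b=2~','~'), '"', '\"'), '=', '":"'), '~', '","') || '"}' j
  from dual;

J
-----------------
{"a":"1","b":"2"}

RTRIM drops the trailing delimiter, the innermost REPLACE escapes any double quotes in the data, and the outer two REPLACEs supply the ":" and "," punctuation between names and values.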
column username1  format a25
column attr1      format a25
column username30 format a25
column attr30     format a25

-- Use in JSON_TABLE (substr just for SQL*Plus output)

select t.id
     , jt.pid1
     , substr(jt.username1 , 1, 25) as username1
     , substr(jt.attr1     , 1, 25) as attr1
       /* etc etc etc */
     , jt.pid30
     , substr(jt.username30, 1, 25) as username30
     , substr(jt.attr30    , 1, 25) as attr30
  from t
     , json_table(
          '{"' || replace(replace(replace(rtrim(c,'~'), '"', '\"'), '=', '":"'), '~', '","') || '"}'
        , '$'
          columns (
             pid1       number         path '$.pid1'
           , username1  varchar2(4000) path '$.username1'
           , attr1      varchar2(4000) path '$.attr1'
             /* etc etc etc */
           , pid30      number         path '$.pid30'
           , username30 varchar2(4000) path '$.username30'
           , attr30     varchar2(4000) path '$.attr30'
          )
       ) jt
;

 ID       PID1 USERNAME1                 ATTR1                          PID30 USERNAME30                ATTR30
--- ---------- ------------------------- ------------------------- ---------- ------------------------- -------------------------
  1   10000001 11abcxyzxyzxyzxyzxyz      11xyzabcab                  10000030 130abcxyzxyzxyzxyzxyzxyzx 130xyzabcabcabcabcabcabca
  2   20000001 21abcxyzxyzxyzxyzxyzxyzxy 21xyzabcabcabca             20000030 230abcxyzxyzxyzxyzxyzxyzx 230xyzabcabcabcabcabcabca
  3   30000001 31abcxyzxyzxyzxyzxyzxyzxy 31xyzabcabcabcabcabc        30000030 330abcxyzxyzxyzxyzxyzxyzx 330xyzabcabcabcabcabcabca
  4   40000001 41abcxyzxyzxyzxyzxyzxyzxy 41xyzabcabcabcabcabcabcab   40000030 430abcxyzxyzxyzxyzxyzxyzx 430xyzabcabcabcabcabcabca
  5   50000001 51abcxyzxyzxyzxyzxyzxyzxy 51xyzabcabcabcabcabcabcab   50000030 530abcxyzxyzxyzxyzxyzxyzx 530xyzabcabcabcabcabcabca
  6   60000001 61abcxyzxyzxyzxyzxyzxyzxy 61xyzabcabcabcabcabcabcab   60000030 630abcxyzxyzxyzxyzxyzxyzx 630xyzabcabcabcabcabcabca
  7   70000001 71abcxyzxyzxyzxyzxyzxyzxy 71xyzabcabcabcabcabcabcab   70000030 730abcxyzxyzxyzxyzxyzxyzx 730xyzabcabcabcabcabcabca
  8   80000001 81abcxyzxyzxyzxyzxyzxyzxy 81xyzabcabcabcabcabcabcab   80000030 830abcxyzxyzxyzxyzxyzxyzx 830xyzabcabcabcabcabcabca
  9   90000001 91abcxyzxyzxyzxyzxyzxyzxy 91xyzabcabcabcabcabcabcab   90000030 930abcxyzxyzxyzxyzxyzxyzx 930xyzabcabcabcabcabcabca
 10  100000001 101abcxyzxyzxyzxyzxyzxyzx 101xyzabcabcabcabcabcabca  100000030 1030abcxyzxyzxyzxyzxyzxyz 1030xyzabcabcabcabcabcabc

10 rows selected.

Hope that's useful ;-)

Cheers
/Kim

Connor McDonald

June 07, 2018 - 1:26 am UTC

nice stuff.

More on JSON_TABLE

Stew Ashton, June 11, 2018 - 8:37 pm UTC

Thanks to Kim for the reference!

Marc Bleron blogged about CSV CLOBs and JSON_TABLE two years ago:
https://odieweblog.wordpress.com/2016/04/22/simple-csv-parsing-using-xmltable-or-json_table/
but he used JSON arrays, not objects.

The OP said "up to 60 attributes" so I suppose some attributes may be missing in some records. I thought I'd emulate that, and also put the attributes in random order, just to emphasize what Kim has already shown. To make things easier for me, I'm going to add newlines after each record.

First, I have a question: I have read that to load a large CLOB, it's better to use a lob locator for a persistent LOB. Is this code OK? (See the two lines after BEGIN.)

create table t(c clob);

declare
  l_clob clob;
begin
  insert into t values(empty_clob()) returning c into l_clob;
  dbms_lob.open(l_clob, dbms_lob.lob_readwrite);
  for rec in (
    with data as (
      select level+1 n, 'a'||(level+1) || '=' || 'v\'||(level+1) || '~' str
      from dual
      connect by level <= 9
    )
    , randoms as (
      select level+1 n, floor(dbms_random.value(2,9.9999)) r
      from dual
      connect by level <= 100
    )
    select 'pid='||(r.n-1)||'~'||
      listagg(str) within group(order by dbms_random.random)
      || chr(10) str
    from data d, randoms r
    where d.n != r.r
    group by r.n
  ) loop
    dbms_lob.writeappend(l_clob, length(rec.str), rec.str);
  end loop;
  dbms_lob.close(l_clob);
  commit;
exception when others then
  if dbms_lob.isopen(l_clob) = 1 then
    dbms_lob.close(l_clob);
  end if;
  raise;
end;
/

set long 200
select substr(c,1,200) from t;

SUBSTR(C,1,200)
--------------------------------------------------------------------------------
pid=1~a5=v\5~a2=v\2~a3=v\3~a10=v\10~a7=v\7~a8=v\8~a6=v\6~a4=v\4~a9=v\9~
pid=2~a6=v\6~a4=v\4~a8=v\8~a3=v\3~a5=v\5~a2=v\2~a10=v\10~
pid=3~a4=v\4~a7=v\7~a10=v\10~a8=v\8~a5=v\5~a9=v\9~a2=v\2~
pid=4~a4=v\4

Now, here is a pipelined table function that will break the CLOB down into VARCHAR2-sized chunks for processing. I think this will scale much better for big volumes.

create or replace function pipe_clob (
  p_clob in clob,
  p_max_lengthb in integer default 4000,
  p_rec_term in varchar2 default chr(10)  -- records here are newline-terminated
) return sys.odcivarchar2list pipelined authid current_user as
  l_amount integer;
  l_offset integer;
  l_buffer varchar2(4000 CHAR);
  l_out varchar2(4000 BYTE);
  l_buff_lengthb integer;
  l_occurence integer;
  l_rec_term_length integer := length(p_rec_term);
begin
  l_amount := p_max_lengthb;
  l_offset := 1;
  -- keep reading while the previous read filled the whole buffer
  while l_amount = p_max_lengthb loop
    begin
      DBMS_LOB.READ (
       p_clob,
       l_amount,
       l_offset,
       l_buffer
      );
    exception when no_data_found then
      l_amount := 0;
    end;
    if l_amount > 0 then
      -- find the last record terminator that still fits within p_max_lengthb bytes
      l_buff_lengthb := p_max_lengthb + 1;
      l_occurence := 0;
      while l_buff_lengthb > p_max_lengthb loop
        l_occurence := l_occurence + 1;
        l_buff_lengthb := instrb(l_buffer, p_rec_term, -1, l_occurence);
      end loop;
      if l_buff_lengthb = 0 then
        l_buff_lengthb := lengthb(l_buffer);
      end if;
      -- pipe out whole records only; the remainder is re-read on the next pass
      l_out := substrb(l_buffer, 1, l_buff_lengthb - l_rec_term_length);
      pipe row(l_out);
      l_offset := l_offset + length(l_out) + l_rec_term_length;
    end if;
  end loop;
  return;
end;
/
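A quick sanity check of the chunking, as a minimal sketch (assuming the one-column t from above; the 3300 limit matches the query further down):

select rownum chunk#, lengthb(column_value) bytes
  from pipe_clob((select c from t), 3300);

Each chunk ends on a record boundary and stays at or under 3300 bytes, so the downstream VARCHAR2 string processing is safe.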

After that:
- I call JSON_ARRAY on each chunk to escape characters if necessary.
- Then I turn each record into an object, but put all the objects inside an array.
- JSON_TABLE breaks down the array and puts each value in the right column.

with json_data as (
  select
  replace(
    replace(
      replace(
        '[{'||substr(json_array(column_value),2,length(json_array(column_value))-4)||'"}]',
        '~\n',
        '"},{"'
      ),
      '=',
      '":"'
    ),
    '~',
    '","'
  )   jstr
  from pipe_clob((select c from t), 3300)
)
select r.* from json_data j, json_table(
  j.jstr, '$[*]' columns
    pid varchar2(9) path '$.pid',
    a2 varchar2(9) path '$.a2',
    a3 varchar2(9) path '$.a3',
    a4 varchar2(9) path '$.a4',
    a5 varchar2(9) path '$.a5',
    a6 varchar2(9) path '$.a6',
    a7 varchar2(9) path '$.a7',
    a8 varchar2(9) path '$.a8',
    a9 varchar2(9) path '$.a9'
) r
where rownum <= 10;

PID       A2        A3        A4        A5        A6        A7        A8        A9
--------- --------- --------- --------- --------- --------- --------- --------- ---------
1         v\2       v\3       v\4       v\5       v\6       v\7       v\8       v\9
2         v\2       v\3       v\4       v\5       v\6                 v\8
3         v\2                 v\4       v\5                 v\7       v\8       v\9
4         v\2       v\3       v\4       v\5       v\6       v\7       v\8       v\9
5         v\2       v\3       v\4       v\5       v\6       v\7       v\8       v\9
6         v\2                 v\4       v\5       v\6       v\7       v\8       v\9
7         v\2       v\3       v\4       v\5       v\6       v\7                 v\9
8         v\2       v\3       v\4       v\5       v\6       v\7       v\8
9         v\2       v\3       v\4       v\5       v\6       v\7       v\8       v\9
10        v\2       v\3       v\4       v\5       v\6       v\7       v\8       v\9

10 rows selected.
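The role of JSON_ARRAY above is the escaping: it turns each backslash in the raw data into \\ and each newline into the two-character sequence \n, which is exactly what the '~\n' pattern in the first REPLACE then targets. A minimal sketch of that behavior (the literal is just for illustration):

select json_array('v\2~' || chr(10)) j from dual;

J
--------------
["v\\2~\n"]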

Best regards,
Stew

Connor McDonald

June 12, 2018 - 1:46 am UTC

I have read that to load a large CLOB, it's better to use a lob locator for a persistent LOB

Do you have a reference?

The Disappearing Reference

Stew Ashton, June 12, 2018 - 12:39 pm UTC

You ask: "Do you have a reference?"

I thought I had read something two days previously. I found nothing that fits my (imaginary?) memory, but I did find this:

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/adlob/working-with-LOBs.html#GUID-F61CA075-8CB1-4E0B-B538-D886F6CAE263

"Before you can start writing data to a persistent LOB using ... PL/SQL ... , you must make the LOB cavalcade/attribute non-NULL.

You can brand a LOB column/attribute non-Aught by initializing the persistent LOB to empty, using an INSERT/UPDATE argument with the function ... EMPTY_CLOB for CLOBs and NCLOBs."

Wouldn't that be more efficient than populating a temporary LOB in a temp tablespace, then copying that LOB to a normal tablespace?
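For reference, a minimal sketch of the initialize-then-write pattern the doc describes (table my_tab and its columns are hypothetical):

create table my_tab (id number, doc clob);

declare
  l_loc clob;
begin
  -- EMPTY_CLOB makes the column non-NULL; RETURNING hands back its locator
  insert into my_tab values (1, empty_clob()) returning doc into l_loc;
  -- subsequent writes through the locator go directly to the persistent LOB
  dbms_lob.writeappend(l_loc, 5, 'hello');
  commit;
end;
/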

Anyway, I wanted your opinion and now I have it, thanks.

Best regards,
Stew

Connor McDonald

June 13, 2018 - 2:05 am UTC

Different use cases I think here - my demo just wanted a nice big clob to work with as source. It is/was a transient thing, so I think a temporary clob makes sense.

Similarly, if I'm writing to a clob as part of an application but ultimately I'll be discarding it, I'll go temporary.

But if I'm building a clob that will persist forever, then I'll probably go permanent.

Why the caveat of "probably"? Because I might be doing 10,000 operations on the clob before I reach my "final state", all of which would be logged, have read consistent chunk memory etc. So in that case, I might go temporary for the 9,999 operations and transfer to permanent for that final operation to make it persist.
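A minimal sketch of that build-temporary-then-persist idea (assuming Stew's one-column table t from above; the loop body is just filler):

declare
  l_tmp  clob;
  l_perm clob;
begin
  -- the many intermediate operations hit a temporary LOB (no table redo/undo)
  dbms_lob.createtemporary(l_tmp, true);
  for i in 1 .. 10000 loop
    dbms_lob.writeappend(l_tmp, 4, 'data');
  end loop;
  -- one final operation makes the result persistent
  insert into t values (empty_clob()) returning c into l_perm;
  dbms_lob.copy(l_perm, l_tmp, dbms_lob.getlength(l_tmp));
  dbms_lob.freetemporary(l_tmp);
  commit;
end;
/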

thank you

Robert, June 12, 2018 - 6:55 pm UTC

Thanks for the response. I tested this against my tables and it works very well!

Connor McDonald

June 13, 2018 - 1:59 am UTC

glad we could help

Thanks

Stew Ashton, June 13, 2018 - 4:35 am UTC

Connor said: "Different use cases I think here - my demo just wanted a nice big clob to work with as source. It is/was a transient thing, so I think a temporary clob makes sense."

Yes, clearly. I was asking about a different use case and should have stated that more explicitly.

"...I might be doing 10,000 operations on the clob before I accomplish my "terminal state", all of which would logged, have read consistent chunk retention etc. Then in that instance, I might go temporary for the nine,999 operations and transfer to permanent for that final operation to make it persist."

Good point! Thanks for the in-depth answer, much appreciated.

Best regards,
Stew


Source: https://asktom.oracle.com/pls/apex/asktom.search?tag=parsing-a-clob-field-with-csv-data-and-put-the-contents-into-it-proper-fields
