Wim Bertels wrote on 17.03.2023 at 11:05:
> what would be the general idea: "easily" convert a hierarchical
> structure like json or xml to a table; for example creating columns by
> appending the key names when going down the tree, using null for
> empty values, adding more columns as needed by the given structure.
> (1-way operation)
>
> a few conceptual gists:
>
> jsonX =
> {
>   "glossary": {
>     "title": "example glossary",
>     "GlossDiv": {
>       "title": "S",
>       "GlossList": {
>         "GlossEntry": {
>           "ID": "SGML",
>           "SortAs": "SGML",
>           "GlossTerm": "Sta..",
>           "Acronym": "SGML",
>           "Abbrev": "ISO 8879:1986",
>           "GlossDef": {
>             "para": "A meta-m..",
>             "GlossSeeAlso": ["GML", "XML"]
>           },
>           "GlossSee": "markup"
>         }
>       }
>     }
>   }
> }
>
> select *
> from json_to_table('jsonX');
> -- generated columns with no data/only nulls could be removed..
> -- arrays could be unnested in the process as well
>
> glossary | glossary.title   | glossary.GlossDiv.title | ..
> -----------------------------------------------------------
> null     | example glossary | S                       |
> ..
>
> the last column, glossary.GlossDiv.GlossList.GlossEntry.GlossSee,
> would hold the value "markup"
>
> ---
>
> what if there are different structures that need to be combined?
> (they could be added in the same manner as before)
>
> jsonY =
> {
>   "s1": [
>     {
>       "f1": "a",
>       "f2": "b",
>       "f3": { "f3.1": "c",
>               "f3.2": "d" }
>     },
>     {
>       "f1": "e",
>       "f4": "g"
>     }
>   ]
> }
>
> select *
> from json_to_table('jsonY');
> -- generated columns with no data/only nulls could be removed..
> -- separator sign is untrusted
>
> s1   | s1.f1 | s1.f2 | s1.f3 | s1.f3.f3.1 | s1.f3.f3.2 | s1.f4
> ----------------------------------------------------------------
> null | a     | b     | null  | c          | d          | null
> null | e     | null  | null  | null       | null       | g

You can't have a function that returns a different set of columns each
time you call it (without specifying the output columns - which you
don't want).

I once wrote a function that flattens a JSON hierarchy to multiple rows
(definition below). Applied to your first example it would return the
following:

                       path                        |     key      |      value
 --------------------------------------------------+--------------+------------------
  /glossary                                        | title        | example glossary
  /glossary/GlossDiv                               | title        | S
  /glossary/GlossDiv/GlossList/GlossEntry          | ID           | SGML
  /glossary/GlossDiv/GlossList/GlossEntry          | Abbrev       | ISO 8879:1986
  /glossary/GlossDiv/GlossList/GlossEntry          | SortAs       | SGML
  /glossary/GlossDiv/GlossList/GlossEntry          | Acronym      | SGML
  /glossary/GlossDiv/GlossList/GlossEntry          | GlossSee     | markup
  /glossary/GlossDiv/GlossList/GlossEntry          | GlossTerm    | Sta..
  /glossary/GlossDiv/GlossList/GlossEntry/GlossDef | para         | A meta-m..
  /glossary/GlossDiv/GlossList/GlossEntry/GlossDef | GlossSeeAlso | ["GML", "XML"]

And the following for the second example:

   path   | key  | value
 ---------+------+-------
  /s1     | f1   | a
  /s1     | f2   | b
  /s1/f3  | f3.1 | c
  /s1/f3  | f3.2 | d
  /s1     | f1   | e
  /s1     | f4   | g

Thomas
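As an aside, the "specify the output columns" route mentioned above does
exist out of the box, e.g. with jsonb_to_recordset(), but the column list
has to be written out by hand for every structure. A rough sketch against
the second example, with the column names and types picked here just for
illustration:

select *
from jsonb_to_recordset(
       '{"s1": [{"f1": "a", "f2": "b", "f3": {"f3.1": "c", "f3.2": "d"}},
                {"f1": "e", "f4": "g"}]}'::jsonb -> 's1'
     ) as t(f1 text, f2 text, f3 jsonb, f4 text);

Keys missing from an element come back as null, but any key that is not
listed in the AS clause is silently dropped.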
create or replace function flatten(p_input jsonb, p_path text)
  returns table(path text, key text, value text)
as
$$
begin
  if jsonb_typeof(p_input) = 'array' then
    -- an array at the top of this level: flatten each element under the same path
    return query
      select f.*
      from jsonb_array_elements(p_input) as a(element)
        cross join flatten(a.element, p_path) f;
  else
    return query
      -- scalar values and arrays of scalars are leaves: emit one row each
      select p_path, e.key, e.value #>> '{}'
      from jsonb_each(p_input) as e(key, value)
      where jsonb_typeof(e.value) not in ('object', 'array')
         or (jsonb_typeof(e.value) = 'array' and jsonb_typeof(e.value -> 0) <> 'object')

      union all

      -- recurse into nested objects, extending the path with the key
      select f.*
      from jsonb_each(p_input) as t(key, value)
        cross join flatten(t.value, p_path||'/'||t.key) as f
      where jsonb_typeof(t.value) = 'object'

      union all

      -- recurse into arrays of objects, once per array element
      select f.*
      from jsonb_each(p_input) as t(key, value)
        cross join jsonb_array_elements(t.value) as a(element)
        cross join flatten(a.element, p_path||'/'||t.key) as f
      where jsonb_typeof(t.value) = 'array'
        and jsonb_typeof(t.value -> 0) = 'object';
  end if;
end;
$$
language plpgsql
immutable
parallel safe
;

-- convenience overload that starts at the root path
create or replace function flatten(p_input jsonb)
  returns table(path text, key text, value text)
as
$$
  select *
  from flatten(p_input, '');
$$
language sql
immutable
parallel safe
;
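For reference, the second result set above should come from a call along
these lines (the sample document written out as valid JSON here):

select *
from flatten('{"s1": [{"f1": "a", "f2": "b", "f3": {"f3.1": "c", "f3.2": "d"}},
                      {"f1": "e", "f4": "g"}]}'::jsonb);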