Search Postgresql Archives

jsonpath duplication result

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, 

I am looking at the documentation and how jsonpath works and I fall on the following example:

(On PostgreSQL 12.4, package for Debian 12.4-3):

SELECT jsonb_path_query(j, '$.**.a.**.b') FROM (SELECT '{"a":{"a":{"b":{"c":3}}}'::jsonb as j) as T;

The result:

 jsonb_path_query 
------------------
 {"c": 3}
 {"c": 3}
(2 lignes)


>From what in understand, it matches twice the last `b` with every `a` before. For instance, adding one more 'a' on the tree:

 SELECT jsonb_path_query(j, '$.**.a.**.b') FROM (SELECT '{"a":{"a":{"a":{"b":{"c":3}}}}}'::jsonb as j) as T;
 jsonb_path_query 
------------------
 {"c": 3}
 {"c": 3}
 {"c": 3}

This could be a fine semantic for jsonpath query (one could call that the Bag semantic), 
however it implies that in some situation the number of output duplicated can become huge.
Actually in the worst case, it could be exponential in the depth of the tree.

On more example (for the fun):

SELECT Count(*) FROM (SELECT jsonb_path_query(j, '$.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.b') FROM (SELECT '{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"b":{"c":3}}}}}}}}}}}}}}}}}}}}}}}}}}}}}'::jsonb as j) as T) as T2;

Takes 24s a returns 20058300 rows.

This semantic for jsonpath is not so classical (other implementation choice to return only once each selected element, in the SET semantic spirit) 
and differs from the semantic of xpath, as shown here, which follows the SET semantic:

SELECT xpath('//a//a//b', '<a><a><a><b>1</b></a></a></a>');

Returns: 

   xpath    
------------
 {<b>1</b>}


We can of course get the appropriate semantic by adding DISTINCT, but the initial list of rows is computed nevertheless (a O(n) algorithm exists, so it is avoidable).

I also wonder if it is problematic to have simple small query like that that can be design to make the database work endlessly?

Best, 

Charles Paperman
Maitre de Conférence à l'université de Lille






[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux