Search Postgresql Archives

Re: Is it possible to index "deep" into a JSONB column?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Imre,

I'm gradually working my way into the combination of SQL, JSONB and
jsonpath that this involves even without the indexing, but this looks
very helpful/promising, especially on the indexing. Thanks a lot for
the input,

Shaheed

On Mon, 30 May 2022 at 11:50, Imre Samu <pella.samu@xxxxxxxxx> wrote:
>
> > In principle, I believe this allows index-assisted access to keys and
> > values nested in arrays and inner objects but in practice, it seems
> > the planner "often" decides to ignore the index in favour of a table scan.
>
> part II.  index usage ;  see  the "Bitmap Index Scan on jpqarr_idx"
>
> SET enable_seqscan = OFF;
> select * from jsonb_table
>   where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @> to_jsonb('1971-02-02'::TEXT);
> ;
> +----+---------------------------------------------------------------------------------------------------------------+
> | id |                                                   jsonb_col                                                   |
> +----+---------------------------------------------------------------------------------------------------------------+
> |  2 | {"stuff": {}, "employee": {"7011": {"date_of_birth": "1970-01-01"}, "7122": {"date_of_birth": "1971-02-02"}}} |
> +----+---------------------------------------------------------------------------------------------------------------+
> (1 row)
>
>
> EXPLAIN ANALYZE select * from jsonb_table
>   where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @> to_jsonb('1971-02-02'::TEXT);
> ;
> +---------------------------------------------------------------------------------------------------------------------------------------------------------+
> |                                                                       QUERY PLAN                                                                        |
> +---------------------------------------------------------------------------------------------------------------------------------------------------------+
> | Bitmap Heap Scan on jsonb_table  (cost=3.00..4.52 rows=1 width=36) (actual time=0.056..0.059 rows=1 loops=1)                                            |
> |   Recheck Cond: (jsonb_path_query_array(jsonb_col, '$."employee".*."date_of_birth"'::jsonpath, '{}'::jsonb, false) @> to_jsonb('1971-02-02'::text))     |
> |   Heap Blocks: exact=1                                                                                                                                  |
> |   ->  Bitmap Index Scan on jpqarr_idx  (cost=0.00..3.00 rows=1 width=0) (actual time=0.026..0.027 rows=1 loops=1)                                       |
> |         Index Cond: (jsonb_path_query_array(jsonb_col, '$."employee".*."date_of_birth"'::jsonpath, '{}'::jsonb, false) @> to_jsonb('1971-02-02'::text)) |
> | Planning Time: 0.255 ms                                                                                                                                 |
> | Execution Time: 0.122 ms                                                                                                                                |
> +---------------------------------------------------------------------------------------------------------------------------------------------------------+
> (7 rows)
>
> regards,
>   Imre
>
>
> Imre Samu <pella.samu@xxxxxxxxx> ezt írta (időpont: 2022. máj. 30., H, 12:30):
>>
>> Hi Shaheed,
>>
>> > WHAT GOES HERE
>>
>> imho check the:   jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' )
>>
>> may example:
>>
>> CREATE TABLE jsonb_table (
>>     id serial primary key,
>>     jsonb_col JSONB
>> );
>>
>> INSERT INTO jsonb_table(jsonb_col)
>> VALUES
>>   ('{"stuff": {},"employee": {"8011":  {"date_of_birth": "1980-01-01"},"8222": {"date_of_birth": "1982-02-02"}}}'),
>>   ('{"stuff": {},"employee": {"7011":  {"date_of_birth": "1970-01-01"},"7122": {"date_of_birth": "1971-02-02"}}}'),
>>   ('{"stuff": {},"employee": {"a12":   {"date_of_birth": "2000-01-01"},"b56":  {"date_of_birth": "2000-02-02"}}}')
>> ;
>>
>> select jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' )  from jsonb_table;
>>
>> -- create index
>> create index jpqarr_idx
>>   on jsonb_table
>>   using gin ( jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' ) );
>> -- tests:
>> select id from jsonb_table
>>   where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @> to_jsonb('2000-01-01'::TEXT);
>> ;
>> DROP TABLE
>> CREATE TABLE
>> INSERT 0 3
>> +------------------------------+
>> |    jsonb_path_query_array    |
>> +------------------------------+
>> | ["1980-01-01", "1982-02-02"] |
>> | ["1970-01-01", "1971-02-02"] |
>> | ["2000-01-01", "2000-02-02"] |
>> +------------------------------+
>> (3 rows)
>>
>> CREATE INDEX
>> +----+
>> | id |
>> +----+
>> |  3 |
>> +----+
>> (1 row)
>>
>>
>> Regards,
>>  Imre
>>
>> Shaheed Haque <shaheedhaque@xxxxxxxxx> ezt írta (időpont: 2022. máj. 29., V, 22:53):
>>>
>>> Suppose I have a JSONB field called "snapshot". I can create a GIN
>>> index on it like this:
>>>
>>>   create index idx1 on mytable using gin (snapshot);
>>>
>>> In principle, I believe this allows index-assisted access to keys and
>>> values nested in arrays and inner objects but in practice, it seems
>>> the planner "often" decides to ignore the index in favour of a table
>>> scan. (As discussed elsewhere, this is influenced by the number of
>>> rows, and possibly other criteria too).
>>>
>>> Now, I know it is possible to index inner objects, so that is snapshot
>>> looks like this:
>>>
>>> {
>>>     "stuff": {},
>>>     "more other stuff": {},
>>>     "employee": {
>>>          "1234": {"date_of_birth": "1970-01-01"},
>>>          "56B789": {"date_of_birth": "1971-02-02"},
>>>     }
>>> }
>>>
>>> I can say:
>>>
>>>   create index idx2 on mytable using gin ((snapshot -> 'employee'));
>>>
>>> But what is the syntax to index only on date_of_birth? I assume a
>>> btree would work since it is a primitive value, but WHAT GOES HERE in
>>> this:
>>>
>>>   create index idx3 on mytable using btree ((snapshot ->'employee' ->
>>> WHAT GOES HERE -> 'date_of_birth'));
>>>
>>> I believe an asterisk "*" would work if 'employee' was an array, but
>>> here it is  nested object with keys. If it helps, the keys are
>>> invariably numbers (in quoted string form, as per JSON).
>>>
>>> Thanks, Shaheed
>>>
>>>






[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux