Hi Geoff, Adrian and Tom,
thanks for your responses so far. Excuse my late response. I will respond to Tom's mail as it covers most points: On 26.02.2017 17:50, Tom Lane wrote: There are multiple reasons why the text-to-datetime conversion functions are not immutable: * some of them depend on the current timezone (but I don't believe date_in does); * all of them depend on the current datestyle setting, eg to resolve '02/03/2017'; * all of them accept strings with time-varying values, such as 'now' or 'today'. You could get around the second and third points with to_timestamp(), but since the only variant of that is one that yields timestamptz and hence is affected by the timezone setting, it's still not immutable. I understand that timezone settings can have serious consequences when parsing text to datetime. My conceptual issue is that wrapping an "unsafe" operation up into a function and **marking** it as "safe" is not making things safer. Basically by-passing security guards. So, what can I do to parse texts to date(times) in a safe manner? I'd like to do it the right way. I can safely provide the timezone for those dates but it won't be in the jsonb data. I'm not entirely sure why the OP feels he needs an index on this _expression_. If he's willing to restrict the column to have the exact format 'YYYY-MM-DD', then a regular textual index would sort the same anyway. Perhaps what's needed is just to add a CHECK constraint verifying that the column has that format. These were my reasons: 1) sanity checks (already noted) 2) index date ranges (using gist) 3) maybe performance (comparing texts vs comparing dates) but I couldn't think of ways to test this That's the current schema: Table "public.docs" Column | Type | Modifiers --------+---------+--------------------------------------------------- id | integer | not null default nextval('docs_id_seq'::regclass) meta | jsonb | Indexes: "docs_pkey" PRIMARY KEY, btree (id) "docs_address_idx" gin (to_tsvector('english'::regconfig, meta ->> 'address'::text)) "docs_address_trgm_idx" gin ((meta ->> 'address'::text) gin_trgm_ops) "docs_birthdate_idx" btree ((meta ->> 'birthdate'::text)) "docs_meta_idx" gin (meta jsonb_path_ops) "docs_name_idx" gin (to_tsvector('english'::regconfig, meta ->> 'name'::text)) Thanks to the ISO date format, I got by with a btree index on birthdate as Tom suggested. The index supports queries like the following (although 22secs still is not great on 10M rows) explain analyze select meta->>'birthdate' from docs where meta->>'birthdate' > '2000-01-01' and meta->>'birthdate' < '2000-12-31' order by meta->>'birthdate'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Index Scan using docs_birthdate_idx on docs (cost=0.43..46067.43 rows=50000 width=136) (actual time=2.118..22177.710 rows=209955 loops=1) Index Cond: (((meta ->> 'birthdate'::text) > '2000-01-01'::text) AND ((meta ->> 'birthdate'::text) < '2000-12-31'::text)) Planning time: 0.205 ms Execution time: 22229.615 ms Regard, Sven |