Re: Why ORing with a false one-time filter turns an Index-Lookup into a SeqScan

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Daniel,

Thanks a lot for the detailed analysis and the suggestions, however I
am not sure marking the proeprty read as immuble is safe - because I
intend to use it as STABLE function - constant during execution of one
statement but might return different values in different
statement-invokations.

What puzzles me is that without ORing both conditions, postgres
behaves as expected.
When just filtering with WHERE current_setting('my.wfsuser', true)=
'admin' it correctly detects this is a static condition, and either
reads all rows or omits reading the rows alltogether.
for filtering just WHERE owner = current_setting('my.wfsuser', true)
it treats the current_setting invokation as stable and does a normal
index lookup just taking a few ms.
but when combining both conditions with an OR everything seems to fall
appart, and instead of an index lookup / seqscan I always get the
(slow) seqscan.

It seems like the check on the stable value of check
current_setting('my.wfsuser', true)= 'admin' will somehow make the
index lookup unusesable - but i have no idea why :/

Best regards, Clemens



Am Mi., 29. Jan. 2025 um 08:43 Uhr schrieb Daniel Blanch Bataller
<daniel.blanch@xxxxxxxxxxxxxxxxx>:
>
> Better said:
>
> current_setting() is STABLE
>
> If a function is IMMUTABLE it is run only once per statement and it can be run during optimization phase, before it's planned or executed.
> If a function is STABLE it is run only once but after planning, during execution.
> if a function is VOLATILE it is run for every row during executiong time.
>
> Said so i would use a function for what you want, A function using plpgsql that returns all records if user is admin or returns just one record otherwise.
>
> Cheers.
>
>
> El mar, 28 ene 2025 a las 23:26, Daniel Blanch Bataller (<daniel.blanch@xxxxxxxxxxxxxxxxx>) escribió:
>>
>> I hope this gives you a clue of what it's going on:
>>
>> Functions can be marked as
>> INMUTABLE
>> STABLE
>> VOLATILE
>>
>> IMMUTABLE indicates that the function cannot modify the database and always returns the same result when given the same argument values; that is, it does not do database lookups or otherwise use information not directly present in its argument list. If this option is given, any call of the function with all-constant arguments can be immediately replaced with the function value.
>>
>> STABLE indicates that the function cannot modify the database, and that within a single table scan it will consistently return the same result for the same argument values, but that its result could change across SQL statements. This is the appropriate selection for functions whose results depend on database lookups, parameter variables (such as the current time zone), etc. (It is inappropriate for AFTER triggers that wish to query rows modified by the current command.) Also note that the current_timestamp family of functions qualify as stable, since their values do not change within a transaction.
>>
>> VOLATILE indicates that the function value can change even within a single table scan, so no optimizations can be made. Relatively few database functions are volatile in this sense; some examples are random(), currval(), timeofday(). But note that any function that has side-effects must be classified volatile, even if its result is quite predictable, to prevent calls from being optimized away; an example is setval().
>>
>>
>> random() for example is volatile. That is if you call to random() in a select, random is evaluated per each row.
>>
>>
>> test=# SELECT random() FROM generate_series(1, 3);
>>        random
>> ---------------------
>>  0.13773282234297923
>>   0.5954521821166239
>>  0.23865666511706607
>> (3 filas)
>>
>> Random is marked as volatile so it is executed for every row.
>>
>> test=# SELECT proname, provolatile
>> FROM pg_proc
>> WHERE proname = 'random';
>>  proname | provolatile
>> ---------+-------------
>>  random  | v
>>
>>
>> To make your function run only once, IMMUTABLE function type seems to be the type of that does the trick. (
>>
>> You can wrap current_setting in your own function and mark it as IMMUTABLE
>>
>> test=# CREATE OR REPLACE FUNCTION myfunction() RETURNS TEXT IMMUTABLE LANGUAGE sql AS $$ SELECT current_setting('my.username') $$;
>> CREATE FUNCTION
>>
>> Now if you use myfunction() it behaves as expected;
>>
>>
>> test=# SET my.username = 'admin';
>> SET
>> test=# EXPLAIN SELECT * FROM test WHERE id = 1 OR myfunction() = 'admin'; -- predicate is always true, all rows are evaluated
>>                        QUERY PLAN
>> ---------------------------------------------------------
>> Seq Scan on test  (cost=0.00..21.00 rows=1000 width=19)
>> (1 fila)
>>
>> test=# EXPLAIN SELECT * FROM test WHERE id = 1 OR myfunction() = 'other'; -- predicate is true only when id is 1
>>                               QUERY PLAN
>> -----------------------------------------------------------------------
>>  Index Scan using test_pkey on test  (cost=0.28..8.29 rows=1 width=19)
>>    Index Cond: (id = 1)
>> (2 filas)
>> (1 fila)
>>
>>
>> I hope this helps.
>>
>>
>> El mar, 28 ene 2025 a las 15:58, Clemens Eisserer (<linuxhippy@xxxxxxxxx>) escribió:
>>>
>>> Hello,
>>>
>>> Any idea what could cause postgresql (16.0) to fall back to a SeqScan
>>> when ORing a falsy one-time filter to a selection which would
>>> otherwise use an index scan?
>>>
>>> 1.) Without the false one-time condition, the query uses the existing
>>> index on owner to perform the lookup:
>>> select * from mytable where owner = current_setting('my.wfsuser', true);
>>> Bitmap Heap Scan on mytable  (cost=43.92..12523.30 rows=3548
>>> width=2341) (actual time=0.032..0.033 rows=0 loops=1)
>>>   Recheck Cond: ((owner)::text = current_setting('my.wfsuser'::text, true))
>>>   ->  Bitmap Index Scan on mytable_owner_idx  (cost=0.00..43.04
>>> rows=3548 width=0) (actual time=0.029..0.030 rows=0 loops=1)
>>>         Index Cond: ((owner)::text = current_setting('my.wfsuser'::text, true))
>>> Planning Time: 0.221 ms
>>> Execution Time: 0.094 ms
>>>
>>> 2.) also a static condition resulting in a false value is correctly recognized:
>>> select * from mytable  where current_setting('my.wfsuser'::text, true)
>>> = 'admin'::text;
>>> Result  (cost=0.01..158384.05 rows=709504 width=2341) (actual
>>> time=0.008..0.009 rows=0 loops=1)
>>>   One-Time Filter: (current_setting('my.wfsuser'::text, true) = 'admin'::text)
>>>   ->  Seq Scan on mytable  (cost=0.01..158384.05 rows=709504
>>> width=2341) (never executed)
>>> Planning Time: 0.163 ms
>>> Execution Time: 0.068 ms
>>>
>>> 3.) Yet when both filters are combined with OR, postgresql executes a SeqScan:
>>> select * from mytable where owner = current_setting('my.wfsuser',
>>> true) OR current_setting('my.wfsuser'::text, true) = 'admin'::text;
>>> Gather  (cost=1000.00..158909.23 rows=7077 width=2341) (actual
>>> time=2783.728..2786.520 rows=0 loops=1)
>>>   Workers Planned: 2
>>>   Workers Launched: 2
>>>   ->  Parallel Seq Scan on mytable  (cost=0.00..157201.53 rows=2949
>>> width=2341) (actual time=2744.147..2744.147 rows=0 loops=3)
>>>         Filter: (((owner)::text = current_setting('my.wfsuser'::text,
>>> true)) OR (current_setting('my.wfsuser'::text, true) = 'admin'::text))
>>>         Rows Removed by Filter: 236501
>>> Planning Time: 0.217 ms
>>> Execution Time: 2786.575 ms
>>>
>>> Thanks and best regards, Clemens
>>>
>>>






[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux