> -----Original Message----- > From: Tom Lane <tgl@xxxxxxxxxxxxx> > Sent: Sunday, August 29, 2021 12:19 > To: ldh@xxxxxxxxxxxxxxxxxx > Cc: Ranier Vilela <ranier.vf@xxxxxxxxx>; Andrew Dunstan > <andrew@xxxxxxxxxxxx>; Justin Pryzby <pryzby@xxxxxxxxxxxxx>; pgsql- > performance@xxxxxxxxxxxxxx > Subject: Re: Big Performance drop of Exceptions in UDFs between V11.2 > and 13.4 > > "ldh@xxxxxxxxxxxxxxxxxx" <ldh@xxxxxxxxxxxxxxxxxx> writes: > > Is it possible that the client I am using or the way I am creating the test > database might affect this scenario? I use DBeaver and use the default > settings to create the database: > > - default encoding: UTF8 > > - collate: English_United States.1252 > > - ctype: English_United States.1252 > > Yeah, I was thinking of quizzing you about that. I wonder whether > something is thinking it needs to transcode to WIN1252 encoding and > then back to UTF8, based on the .1252 property of the LC_XXX settings. > That shouldn't account for any 500X factor either, but we're kind of > grasping at straws here. > > Does Windows have any locale choices that imply UTF8 encoding > exactly, and if so, do your results change when using that? Alternatively, > try creating a database with WIN1252 encoding and those locale > settings. > > regards, tom lane Yeah, grasping at straws... and no material changes 😊 This is mystifying. show lc_messages; -- English_United States.1252 create table sampletest (a varchar, b varchar); insert into sampletest (a, b) select substr(md5(random()::text), 0, 15), (100000000*random())::integer::varchar from generate_series(1,100000); CREATE OR REPLACE FUNCTION toFloat(str varchar, val real) RETURNS real AS $$ BEGIN RETURN case when str is null then val else str::real end; EXCEPTION WHEN OTHERS THEN RETURN val; END; $$ LANGUAGE plpgsql COST 1 IMMUTABLE; explain (analyze,buffers,COSTS,TIMING) select MAX(toFloat(a, null)) as "a" from sampletest --Aggregate (cost=1477.84..1477.85 rows=1 width=4) (actual time=89527.032..89527.033 rows=1 loops=1) -- Buffers: shared hit=647 -- -> Seq Scan on sampletest (cost=0.00..1197.56 rows=56056 width=32) (actual time=0.024..37.811 rows=100000 loops=1) -- Buffers: shared hit=637 --Planning: -- Buffers: shared hit=24 --Planning Time: 0.347 ms --Execution Time: 89527.501 ms explain (analyze,buffers,COSTS,TIMING) select MAX(toFloat(b, null)) as "b" from sampletest --Aggregate (cost=2137.00..2137.01 rows=1 width=4) (actual time=186.605..186.606 rows=1 loops=1) -- Buffers: shared hit=637 -- -> Seq Scan on sampletest (cost=0.00..1637.00 rows=100000 width=8) (actual time=0.008..9.679 rows=100000 loops=1) -- Buffers: shared hit=637 --Planning: -- Buffers: shared hit=4 --Planning Time: 0.339 ms --Execution Time: 186.641 ms At this point, I am not sure how to proceed except to rethink that toFloat() function and many other places where we use exceptions. We get such dirty data that I need a "safe" way to convert a string to float without throwing an exception. BTW, I tried other combinations in case there may have been some weird interactions with the ::REAL conversion operator, but nothing made any change. Could you recommend another approach off the top of your head? I could use regexes for testing etc... Or maybe there is another option like a no-throw conversion that's built in or in some extension that you may know of? Like the "SAFE." Prefix in BigQuery. Thank you, Laurent.