Re: SubtransControlLock and performance problems

Lars Aksel Opsahl <Lars.Opsahl@xxxxxxxx> · Thu, 20 Feb 2020 09:20:50 +0000

Hi

>From: 
Alvaro Herrera <alvherre@xxxxxxxxxxxxxxx>
>Sent: Wednesday, February 19, 2020 4:23 PM
>To: 
Lars Aksel 
Opsahl <Lars.Opsahl@nibio.no>
>Cc:
Laurenz 
Albe <laurenz.albe@cybertec.at>;
Pavel 
Stehule <pavel.stehule@gmail.com>; Tom Lane <tgl@sss.pgh.pa.us>;
pgsql-performance@lists.postgresql.org <pgsql-performance@lists.postgresql.org>
>Subject: Re: SubtransControlLock and performance problems
> 
>On 2020-Feb-19,
Lars 
Aksel Opsahl wrote:
>
>> With the values above I did see same performance problems and we ended
>> with a lot of subtransControlLock.
>> 
>> So I started to change the code based on your feedbacks.
>> 
>> - What seems to work very good in combination with a catch exception
>> and retry pattern is to insert the data in to separate table for each
>> job. (I the current
testcase we reduced the number of
>> subtransControlLock from many hundreds to almost none.)
>
>I think at this point your only recourse is to start taking profiles to
>see where the time is going. 
Without that, you're just flying blind and
>whatever you do will not necessarily move your needle at all.

Hi

Yes I totally agree with you and yes I have tried to do some profiling and testing while developing.

>From the worst case to best case the time is reduced 15 times (from 300 minutes to 20 minutes) when testing a small dataset for with 619230 surface (25909671 total line points) with the test below “resolve_overlap_gap_run('org_jm.jm_ukomm_flate','figurid','geo',4258,false,'test_topo_jm',0.000001,31,3000);
 “

The reason for this seems to be related to the problems described by Laurenz Albe related to how Postgres handles try and catch and sub transactions, which I did not know about. If we don't have this is mind and we start to get subtranslocks it seems to kill
 the performance in some cases.

In this test I ran with 31 parallel threads which is very high on a server with only 32 cores and maybe not realistic. I just did this now see what happens when
 I try to push a server to it’s limits and maximise the performance increase. If I reduce this to 1 single thread, there should be now difference and if run on
 16 threads the difference would much much smaller. 

I will now start to run on datasets which are 10 times bigger to check how thing scales, but then run with around maybe 28 parallel jobs. 

The two branches I have tested on now which should show the main difference are here.

https://github.com/larsop/resolve-overlap-and-gap/tree/add_postgis_topology_using_TopoGeo_addLinestringwhich is the faster one.

https://github.com/larsop/resolve-overlap-and-gap/tree/add_postgis_topology which is slower one, but here I have now added a check on number of subtranslocks before I
 kick of new jobs and that reduced time form 9 hours  to 3 hours.

Thanks.

Lars