Hi Laurenz,
My bad, I forgot to mention that I had already tried VACUUM too, but it didn't help. To demonstrate the result, I ran it again:
# vacuum (analyze, verbose) test_db_bench_1;
INFO: vacuuming "perfkit.public.test_db_bench_1"
INFO: launched 2 parallel vacuum workers for index cleanup (planned: 2)
INFO: finished vacuuming "perfkit.public.test_db_bench_1": index scans: 0
pages: 0 removed, 684731 remain, 17510 scanned (2.56% of total)
tuples: 0 removed, 3999770 remain, 0 are dead but not yet removable
removable cutoff: 27200203, which was 0 XIDs old when operation ended
index scan bypassed: 7477 pages from table (1.09% of total) have 20072 dead item identifiers
avg read rate: 0.099 MB/s, avg write rate: 0.009 MB/s
buffer usage: 27770 hits, 11 misses, 1 dirtied
WAL usage: 1 records, 1 full page images, 1762 bytes
system usage: CPU: user: 0.15 s, system: 0.71 s, elapsed: 0.87 s
INFO: vacuuming "perfkit.pg_toast.pg_toast_16554"
INFO: finished vacuuming "perfkit.pg_toast.pg_toast_16554": index scans: 0
pages: 0 removed, 0 remain, 0 scanned (100.00% of total)
tuples: 0 removed, 0 remain, 0 are dead but not yet removable
removable cutoff: 27200203, which was 0 XIDs old when operation ended
new relfrozenxid: 27200203, which is 4000060 XIDs ahead of previous value
index scan not needed: 0 pages from table (100.00% of total) had 0 dead item identifiers removed
avg read rate: 113.225 MB/s, avg write rate: 0.000 MB/s
buffer usage: 3 hits, 1 misses, 0 dirtied
WAL usage: 1 records, 0 full page images, 188 bytes
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: analyzing "public.test_db_bench_1"
INFO: "test_db_bench_1": scanned 30000 of 684731 pages, containing 175085 live rows and 897 dead rows; 30000 rows in sample, 3996204 estimated total rows
VACUUM
# vacuum (analyze, verbose) test_db_bench_tenants;
INFO: vacuuming "perfkit.public.test_db_bench_tenants"
INFO: launched 2 parallel vacuum workers for index cleanup (planned: 2)
INFO: finished vacuuming "perfkit.public.test_db_bench_tenants": index scans: 0
pages: 0 removed, 78154 remain, 1 scanned (0.00% of total)
tuples: 0 removed, 4064008 remain, 0 are dead but not yet removable
removable cutoff: 27200204, which was 0 XIDs old when operation ended
new relfrozenxid: 27200204, which is 2 XIDs ahead of previous value
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 34 hits, 0 misses, 0 dirtied
WAL usage: 1 records, 0 full page images, 188 bytes
system usage: CPU: user: 0.01 s, system: 0.08 s, elapsed: 0.10 s
INFO: analyzing "public.test_db_bench_tenants"
INFO: "test_db_bench_tenants": scanned 30000 of 78154 pages, containing 1560000 live rows and 0 dead rows; 30000 rows in sample, 4064008 estimated total rows
VACUUM
# vacuum (analyze, verbose) test_db_bench_tenant_closure;
INFO: vacuuming "perfkit.public.test_db_bench_tenant_closure"
INFO: launched 1 parallel vacuum worker for index cleanup (planned: 1)
INFO: finished vacuuming "perfkit.public.test_db_bench_tenant_closure": index scans: 0
pages: 0 removed, 181573 remain, 3808 scanned (2.10% of total)
tuples: 0 removed, 28505125 remain, 0 are dead but not yet removable
removable cutoff: 27200205, which was 0 XIDs old when operation ended
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 97.907 MB/s
buffer usage: 7680 hits, 0 misses, 3803 dirtied
WAL usage: 3800 records, 2 full page images, 224601 bytes
system usage: CPU: user: 0.08 s, system: 0.21 s, elapsed: 0.30 s
INFO: analyzing "public.test_db_bench_tenant_closure"
INFO: "test_db_bench_tenant_closure": scanned 30000 of 181573 pages, containing 4709835 live rows and 0 dead rows; 30000 rows in sample, 28505962 estimated total rows
VACUUM
The plan after the vacuum:
Limit (cost=1.98..152.05 rows=1 width=44) (actual time=0.012..0.013 rows=0 loops=1)
-> Nested Loop (cost=1.98..1052.49 rows=7 width=44) (actual time=0.011..0.012 rows=0 loops=1)
-> Nested Loop (cost=1.55..1022.18 rows=7 width=37) (actual time=0.011..0.011 rows=0 loops=1)
-> Nested Loop (cost=1.12..1019.03 rows=7 width=8) (actual time=0.011..0.011 rows=0 loops=1)
-> Index Scan using test_db_bench_tenants_uuid on test_db_bench_tenants tenants_parent (cost=0.56..2.77 rows=1 width=8) (actual time=0.010..0.010 rows=0 loops=1)
Index Cond: ((uuid)::text = '4c79c1c5-21ae-45a0-8734-75d67abd0330'::text)
Filter: (NOT is_deleted)
-> Index Scan using test_db_bench_tenant_closure_pkey on test_db_bench_tenant_closure tenants_closure (cost=0.56..1006.97 rows=929 width=16) (never executed)
Index Cond: (parent_id = tenants_parent.id)
Filter: (barrier <= 0)
-> Index Scan using test_db_bench_tenants_pkey on test_db_bench_tenants tenants_child (cost=0.43..0.45 rows=1 width=45) (never executed)
Index Cond: (id = tenants_closure.child_id)
Filter: (NOT is_deleted)
-> Index Scan using test_db_bench_1_idx_tenant_id_3 on test_db_bench_1 (cost=0.43..2.98 rows=135 width=44) (never executed)
Index Cond: ((tenant_id)::text = (tenants_child.uuid)::text)
Planning Time: 0.874 ms
Execution Time: 0.053 ms
(17 rows)
The planning time even increased :)
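For what it's worth, a single EXPLAIN can be noisy. If pg_stat_statements is available, plan-time tracking gives planning vs. execution times averaged over many calls. A sketch, assuming PG13+ with the extension installed and pg_stat_statements.track_planning = on:

```sql
-- Requires: shared_preload_libraries = 'pg_stat_statements'
--           pg_stat_statements.track_planning = on
-- Columns mean_plan_time / mean_exec_time exist since PG 13.
SELECT query, calls, mean_plan_time, mean_exec_time
FROM pg_stat_statements
WHERE query LIKE '%test_db_bench_1%'
ORDER BY mean_plan_time DESC;
```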
I also played around with the indexes.
First, I dropped all the indexes containing the tenant_id field, except the one that is used in the execution plan:
DROP INDEX test_db_bench_1_idx_type_16;
DROP INDEX test_db_bench_1_idx_queue_18;
DROP INDEX test_db_bench_1_idx_queue_5;
After that:
Planning Time: 0.889 ms
Execution Time: 0.047 ms
DROP INDEX test_db_bench_1_idx_uuid_0;
Planning Time: 0.841 ms
Execution Time: 0.047 ms
DROP INDEX test_db_bench_1_idx_completion_time_ns_1;
DROP INDEX test_db_bench_1_idx_cti_entity_uuid_2;
DROP INDEX test_db_bench_1_idx_enqueue_time_ns_10;
Planning Time: 0.830 ms
Execution Time: 0.048 ms
DROP INDEX test_db_bench_1_idx_euc_id_4;
DROP INDEX test_db_bench_1_idx_policy_id_12;
DROP INDEX test_db_bench_1_idx_queue_19;
Planning Time: 0.826 ms
Execution Time: 0.044 ms
DROP INDEX test_db_bench_1_idx_queue_6;
DROP INDEX test_db_bench_1_idx_resource_id_11;
DROP INDEX test_db_bench_1_idx_resource_id_14;
Planning Time: 0.821 ms
Execution Time: 0.048 ms
DROP INDEX test_db_bench_1_idx_result_code_13;
DROP INDEX test_db_bench_1_idx_start_time_ns_9;
DROP INDEX test_db_bench_1_idx_state_8;
Planning Time: 0.803 ms
Execution Time: 0.044 ms
DROP INDEX test_db_bench_1_idx_type_15;
DROP INDEX test_db_bench_1_idx_type_17;
DROP INDEX test_db_bench_1_idx_update_time_ns_7;
At that point only 3 indexes were left on the table, with only a slight improvement in Planning Time:
Indexes:
"test_db_bench_1_pkey" PRIMARY KEY, btree (id)
"test_db_bench_1_idx_tenant_id_3" btree (tenant_id)
"test_db_bench_1_uuid_key" UNIQUE CONSTRAINT, btree (uuid)
Planning Time: 0.799 ms
Execution Time: 0.044 ms
That is, the situation is still not good: almost all of the indexes have been removed, yet the planning time has decreased only insignificantly, and it still remains much longer than the query execution time.
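Since planning dominates here, one workaround on our side (if the application can use it) would be to prepare the statement once and reuse a generic plan, so the planning cost is paid only on the first executions. A sketch of the idea, assuming PG12+ where plan_cache_mode exists; bench_q is just a name I picked:

```sql
-- Force the cached generic plan instead of replanning per execution.
SET plan_cache_mode = force_generic_plan;  -- available since PG 12

PREPARE bench_q(text) AS
SELECT t1.id, t1.tenant_id
FROM test_db_bench_1 t1
JOIN test_db_bench_tenants tenants_child
  ON tenants_child.uuid = t1.tenant_id
 AND NOT tenants_child.is_deleted
JOIN test_db_bench_tenant_closure tenants_closure
  ON tenants_closure.child_id = tenants_child.id
 AND tenants_closure.barrier <= 0
JOIN test_db_bench_tenants tenants_parent
  ON tenants_parent.id = tenants_closure.parent_id
 AND tenants_parent.uuid = $1
 AND NOT tenants_parent.is_deleted
LIMIT 1;

EXECUTE bench_q('4c79c1c5-21ae-45a0-8734-75d67abd0330');
```

That only hides the cost rather than explaining it, of course, but it would confirm the slowness is purely in planning.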
As for the stats: default_statistics_target has not been changed and has its default value of 100, and no explicit per-column statistics targets have been set ("Stats target" in the \d+ output is empty).
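To double-check, the per-column targets can also be read directly from the catalog rather than eyeballing \d+:

```sql
-- attstattarget = -1 means the column falls back to default_statistics_target
SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'test_db_bench_1'::regclass
  AND attnum > 0
  AND NOT attisdropped;
```

All columns show -1 here.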
Could it be a regression? I'll check it on PG14 when I get a chance.
--
Mikhail
On Mon, 11 Sept 2023 at 09:15, Laurenz Albe <laurenz.albe@xxxxxxxxxxx> wrote:
On Fri, 2023-09-08 at 18:51 +0800, Mikhail Balayan wrote:
> I have three tables:
> - test_db_bench_1
> - test_db_bench_tenants
> - test_db_bench_tenant_closure
>
> And the query to join them:
> SELECT "test_db_bench_1"."id" id, "test_db_bench_1"."tenant_id"
> FROM "test_db_bench_1"
> JOIN "test_db_bench_tenants" AS "tenants_child" ON (("tenants_child"."uuid" = "test_db_bench_1"."tenant_id")
> AND ("tenants_child"."is_deleted" != true))
> JOIN "test_db_bench_tenant_closure" AS "tenants_closure" ON (("tenants_closure"."child_id" = "tenants_child"."id")
> AND ("tenants_closure"."barrier" <= 0))
> JOIN "test_db_bench_tenants" AS "tenants_parent" ON (("tenants_parent"."id" = "tenants_closure"."parent_id")
> AND ("tenants_parent"."uuid" IN ('4c79c1c5-21ae-45a0-8734-75d67abd0330'))
> AND ("tenants_parent"."is_deleted" != true))
> LIMIT 1
>
>
> With following execution plan:
>
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> ---------------
> Limit (cost=1.56..1.92 rows=1 width=44) (actual time=0.010..0.011 rows=0 loops=1)
> -> Nested Loop (cost=1.56..162.42 rows=438 width=44) (actual time=0.009..0.009 rows=0 loops=1)
> -> Nested Loop (cost=1.13..50.27 rows=7 width=36) (actual time=0.008..0.009 rows=0 loops=1)
> -> Nested Loop (cost=0.84..48.09 rows=7 width=8) (actual time=0.008..0.009 rows=0 loops=1)
> -> Index Scan using test_db_bench_tenants_uuid on test_db_bench_tenants tenants_parent (cost=0.41..2.63 rows=1 width=8) (actual time=0.008..0.008 rows=0 loops=1)
> Index Cond: ((uuid)::text = '4c79c1c5-21ae-45a0-8734-75d67abd0330'::text)
> Filter: (NOT is_deleted)
> -> Index Scan using test_db_bench_tenant_closure_pkey on test_db_bench_tenant_closure tenants_closure (cost=0.42..45.06 rows=40 width=16) (never executed)
> Index Cond: (parent_id = tenants_parent.id)
> Filter: (barrier <= 0)
> -> Index Scan using test_db_bench_tenants_pkey on test_db_bench_tenants tenants_child (cost=0.29..0.31 rows=1 width=44) (never executed)
> Index Cond: (id = tenants_closure.child_id)
> Filter: (NOT is_deleted)
> -> Index Scan using test_db_bench_1_idx_tenant_id_3 on acronis_db_bench_heavy (cost=0.43..14.66 rows=136 width=44) (never executed)
> Index Cond: ((tenant_id)::text = (tenants_child.uuid)::text)
> Planning Time: 0.732 ms
> Execution Time: 0.039 ms
>
>
> Where the planning time gets in the way as it takes an order of magnitude more time than the actual execution.
>
> Is there a possibility to reduce this time? And, in general, to understand why planning takes so much time.
You could try to VACUUM the involved tables; indexes with many entries pointing to dead tuples
can cause a long planing time.
Also, there are quite a lot of indexes on "test_db_bench_1". On a test database, drop some
indexes and see if that makes a difference.
Finally, check if "default_statistics_target" is set to a high value, or if the "Stats target"
for some column in the "\d+ tablename" output is set higher than 100.
Yours,
Laurenz Albe