I might be completely wrong, and would really like to understand the details, in order to explain them to my team, and to other who might
encounter the same problem.
Interne Envoyé : vendredi 5 juillet 2024 04:05 À : FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT) <simon.freyburger@xxxxxxx>; pgsql-performance@xxxxxxxxxxxxxxxxxxxx; Peter Geoghegan <pg@xxxxxxx> Objet : Re: How to solve my slow disk i/o throughput during index scan On 7/4/24 22: 23, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT) wrote: > Hello, > > Thank you, splitting in “OR” query definitely
enables bitmap heap scans, > and thus parallelized read to disk On 7/4/24 22:23, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE
TGV / DM RMP YIELD MANAGEMENT) wrote:
> Hello,
>
> Thank you, splitting in “OR” query definitely enables bitmap heap scans,
> and thus parallelized read to disk 😃! I though did not understand your > second point, what is parallel append, and how to enable it ?
Just for example:
DROP TABLE IF EXISTS t CASCADE;
CREATE TABLE t (id int not null, payload text) PARTITION BY RANGE (id);
CREATE TABLE p1 PARTITION OF t FOR VALUES FROM (0) TO (1000);
CREATE TABLE p2 PARTITION OF t FOR VALUES FROM (1000) TO (2000);
CREATE TABLE p3 PARTITION OF t FOR VALUES FROM (2000) TO (3000);
CREATE TABLE p4 PARTITION OF t FOR VALUES FROM (3000) TO (4000);
INSERT INTO t SELECT x % 4000, repeat('a',128) || x FROM
generate_series(1,1E5) AS x;
ANALYZE t;
SET enable_parallel_append = on;
SET parallel_setup_cost = 0.00001;
SET parallel_tuple_cost = 0.00001;
SET max_parallel_workers_per_gather = 8;
SET min_parallel_table_scan_size = 0;
SET min_parallel_index_scan_size = 0;
EXPLAIN (COSTS OFF)
SELECT t.id, t.payload FROM t WHERE t.id % 2 = 0
GROUP BY t.id, t.payload;
Group
Group Key: t.id, t.payload
-> Gather Merge
Workers Planned: 6
-> Sort
Sort Key: t.id, t.payload
-> Parallel Append
-> Parallel Seq Scan on p1 t_1
Filter: ((id % 2) = 0)
-> Parallel Seq Scan on p2 t_2
Filter: ((id % 2) = 0)
-> Parallel Seq Scan on p3 t_3
Filter: ((id % 2) = 0)
-> Parallel Seq Scan on p4 t_4
Filter: ((id % 2) = 0)
Here the table is scanned in parallel. It also works with IndexScan.
--
regards, Andrei Lepikhov
-------
Ce message et toutes les pièces jointes sont établis à l'intention exclusive de ses destinataires et sont confidentiels. L'intégrité de ce message n'étant pas assurée sur Internet, la SNCF ne peut être tenue responsable des altérations qui pourraient se produire sur son contenu. Toute publication, utilisation, reproduction, ou diffusion, même partielle, non autorisée préalablement par la SNCF, est strictement interdite. Si vous n'êtes pas le destinataire de ce message, merci d'en avertir immédiatement l'expéditeur et de le détruire. ------- This message and any attachments are intended solely for the addressees and are confidential. SNCF may not be held responsible for their contents whose accuracy and completeness cannot be guaranteed over the Internet. Unauthorized use, disclosure, distribution, copying, or any part thereof is strictly prohibited. If you are not the intended recipient of this message, please notify the sender immediately and delete it. |