Yes, we have a primary key and another index, both containing a bigint and a UUID that is stored in the database as varchar(64). Many sessions insert values while, at the same time, many sessions run the select query. The table is 13GB, the primary key index is 6GB, and the other index is 13GB as well; the select query uses an index-only scan on the 13GB index.
In addition to this table and query, the workload involves two other hot tables with similar bigint and UUID composite indexes; many sessions insert into them and many sessions select from them as well, but they are smaller (table 3790MB, index 1445MB).
When the workload started, database server CPU usage climbed very fast and reached almost 100% within tens of seconds; then BufferMapping LWLock waits appeared for these selects, and even the inserts started to see BufferMapping contention.
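(Not part of the original report, just a minimal sketch of how the contention could be confirmed while the workload runs: sample current wait events from pg_stat_activity and look for LWLock / BufferMapping at the top.)

    -- sketch: count sessions per wait event during the workload
    SELECT wait_event_type,
           wait_event,
           state,
           count(*) AS sessions
    FROM pg_stat_activity
    WHERE backend_type = 'client backend'
    GROUP BY wait_event_type, wait_event, state
    ORDER BY sessions DESC;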
Attached please find the details for the top table and its SQL; the other two tables and their queries are similar.
Thanks,
James
David Mullineux <dmullx@xxxxxxxxx> wrote on Sunday, December 22, 2024 at 12:41 AM:
Depends on a lot of things... The visibility map sounds like it's impacted here. Are your inserts clustered toward one end of the index (like a monotonically increasing serial id) or scattered around the index values? How big are the table, the index and shared buffers? An example would really help.
On Sat, 21 Dec 2024, 11:51 James Pang, <jamespang886@xxxxxxxxx> wrote:
Hi,
We have a simple select .... from table where ... (that matches the index); the table has 80 million rows. Many application sessions run the query while, at the same time, some other sessions run insert into ... against this table. From pg_stat_statements, shared_blks_hit shows 31652 per call. We see very high CPU, almost 100%, during the application workload test, and heavy LWLock BufferMapping waits for these queries. But manually running the SQL shows only 2148 shared_blks_hit per call. This is a simple SQL statement, and from pg_profile we did see it use the same index scan as the manual run. What could be the possible reason for such a big difference in shared_blks_hit?
PG v14.8
Thanks,
James
Table "testapp.testablename" Column | Type | Collation | Nullable | Default ------------------+--------------------------------+-----------+----------+------------------- lmapuuid | character varying(64) | | not null | sssid | bigint | | not null | evctid | bigint | | not null | lmapparameter | character varying(4000) | | not null | createtime | timestamp(0) without time zone | | not null | clock_timestamp() lmdtime | timestamp(0) without time zone | | not null | clock_timestamp() Indexes: "pk_testablename" PRIMARY KEY, btree (lmapuuid, sssid) "idx_testablename_evctid_1" btree (sssid, evctid, lmapparameter, lmapuuid) Publications: "testapp_10001" Triggers: tr_testablename BEFORE DELETE ON testablename FOR EACH ROW WHEN (USER <> 'batcha'::name OR USER = 'batchb'::name) EXECUTE FUNCTION trigger_fct_tr_testablename() tr_testablename_lmt BEFORE UPDATE ON testablename FOR EACH ROW WHEN (USER !~~ 'batch%'::text) EXECUTE FUNCTION trigger_fct_common_lmt() table and index size select relname,reltuples,relpages,relpages*8/1024 "size(mb)" from pg_class where relname='xxx'; relname | reltuples | relpages size(mb) ----------------------+---------------+---------- testablename | 8.2382416e+07 | 1761908 13764 pk_testablename | 8.2382416e+07 | 729228 5697 idx_testablename_evctid_1 | 8.2382416e+07 | 1742734 13615 show shared_buffers; shared_buffers ---------------- 65369MB (1 row) sssid, evctid are bigint , lmapuuid is UUID and stored in database are varchar(64). SELECT sssid,lmapuuid AS UUID ,evctid,lmapparameter AS lllparm FROM testablename WHERE evctid = $1 AND sssid = $2 AND lmapparameter LIKE '%x.abc%' AND ((lmapparameter LIKE '%ED=%' AND lmapparameter LIKE '%AID=%') OR (lmapparameter LIKE '%J=%' AND lmapparameter LIKE '%RT=%') OR ((LINKPARAMETER LIKE '%ED=%' OR (LINKPARAMETER LIKE '%J=%' )) AND lmapparameter LIKE '%UJ=1%')) AND lmapparameter NOT LIKE '%disabled%' AND lmapparameter NOT LIKE '%IEH=%' FETCH FIRST $3 ROWS ONLY Limit (cost=0.82..8.63 rows=31 width=116) (actual time=0.023..14.371 rows=7 loops=1) Output: sssid, lmapuuid, evctid, lmapparameter Buffers: shared hit=2148 -> Index Only Scan using idx_testablename_evctid_1 on testablename (cost=0.82..30.81 rows=119 width=116) (actual time=0.022..14.368 row s=7 loops=1) Output: sssid, lmapuuid, evctid, lmapparameter Index Cond: ((testablename.sssid = 'xxxxxxx'::bigint) AND (testablename.evctid = 'xxxxxxx'::bigint)) Filter: (((testablename.lmapparameter)::text ~~ '%x.abc%'::text) AND ((testablename.lmapparameter)::text !~~ '%disabled%'::text) AND ((testablename.lmapparameter)::text !~~ '%IEH=%'::text) AND ((((testablename.lmapparameter)::text ~~ '%ED=%'::text) AND ((mtgconferencelinkm ap.lmapparameter)::text ~~ '%AID=%'::text)) OR (((testablename.lmapparameter)::text ~~ '%J=%'::text) AND ((testablename.lmapparameter)::text ~~ '%RT=%'::text)) OR ((((testablename.lmapparameter)::text ~~ '%ED=%'::text) OR ((testablename.lmapparameter)::text ~~ '%J=%'::text)) AND (( testablename.lmapparameter)::text ~~ '%UJ=1%'::text)))) Rows Removed by Filter: 36155 Heap Fetches: 260 Buffers: shared hit=2148 Query Identifier: -4801201989067066267 Planning Time: 0.318 ms Execution Time: 14.391 ms above are test from psql manual running with prepared statement, running many times, it show shared_hit=2148 , index only scan but with heap fetches. 
Below is a pg_stat_statements snapshot for this SQL. Some sessions insert into this table in parallel while, at the same time, many sessions run this select statement; shared_blks_hit/calls shows a huge number, tens of times higher than the 2148 from the manual test. The pg_stat_statements query computes avgsblks = shared_blks_hit/calls, avgdirtied = shared_blks_dirtied/calls, avgread = shared_blks_read/calls, avgfpi = fpi/calls, avgwal = walbytes/calls.

       sample_time       | calls |    min_exec_time     |   mean_exec_time   |   max_exec_time    | avgsblks | avgdirtied | avgread | avgfpi |         avgwal         |      avgrtime      | avgwtime
-------------------------+-------+----------------------+--------------------+--------------------+----------+------------+---------+--------+------------------------+--------------------+----------
 2024-12-20 02:45:20+00  |  1236 | 0.022557             | 0.50104122815534   | 1.377047           |      810 |          0 |       0 |      0 | 0.34627831715210355987 |                  0 |        0
 2024-12-20 02:55:31+00  | 11435 | 0.017463             | 4.32600441696545   | 200.30973999999998 |     5272 |          0 |       0 |      0 | 0.06961084390030607783 |                  0 |        0
 2024-12-20 03:15:59+00  | 16035 | 0.020143             | 115.54667212641019 | 15157.307508       |    20735 |          2 |       1 |      0 | 2269.3531025880885563  | 0.8249557188649828 |        0
 2024-12-20 03:25:10+00  |   147 | 0.018453999999999998 | 144.3084889591837  | 198.37187500000002 |   206505 |          0 |       0 |      0 | 1.5374149659863946     |                  0 |        0
 2024-12-20 05:55:38+00  |    56 | 0.028886000000000002 | 201.53043271428575 | 838.709004         |   196878 |          0 |       0 |     80 | 329936.732142857143    |                  0 |        0
 2024-12-20 06:35:28+00  |  7344 | 0.025979             | 694.054243656318   | 10391.441267       |    36162 |          0 |       0 |      0 | 117.1512799564270153   |                  0 |        0
 2024-12-20 07:05:59+00  |   124 | 0.056067             | 161.0531775241936  | 167.077285         |   235815 |          0 |       0 |      0 | 1.032258064

In addition to this table and query, the workload inserts into two other tables from many sessions and runs index-based selects against them from many sessions as well. The other two tables have similar primary keys and indexes containing bigint and UUID columns, and their queries use a three-column composite index on (bigint, uuid, varchar). Table size 3790MB, index size 1445MB.
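(For reference only: the exact snapshot query was not posted in the thread. A minimal sketch of a per-call query along those lines, assuming the fpi/walbytes columns map to wal_fpi and wal_bytes in pg_stat_statements on PG 14, could look like this.)

    -- sketch: per-call metrics for one statement from pg_stat_statements (PG 14 columns)
    SELECT now() AS sample_time,
           calls,
           min_exec_time,
           mean_exec_time,
           max_exec_time,
           shared_blks_hit::numeric     / calls AS avgsblks,
           shared_blks_dirtied::numeric / calls AS avgdirtied,
           shared_blks_read::numeric    / calls AS avgread,
           wal_fpi::numeric             / calls AS avgfpi,
           wal_bytes                    / calls AS avgwal
    FROM pg_stat_statements
    WHERE queryid = -4801201989067066267;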