Hi,

I am in need of a very robust (especially fast on read, non-blocking on update) tree structure storage (95% of trees are of depth < 4; the current maximum is 12). We have 10k-100k trees now, millions in the future. I ran many tests and benchmarks of the usual operations, and in the end the materialized path ('1.5.3' path notation) seems most promising. My last two candidates for its storage are ltree and integer[], so I am comparing the following benchmark tables with exactly the same data (5 regular trees - each node except the leaves has 5 children, depth = 9, 2,441,405 nodes in total, ids numbered according to a breadth-first traversal):

TABLES:

A/ integer[]

CREATE TABLE test (
    id SERIAL,
    apath INTEGER[],
    CONSTRAINT test_pkey PRIMARY KEY (id)
);
CREATE INDEX test_idx1 ON test USING gin (apath);

B/ ltree

CREATE TABLE test (
    id SERIAL,
    lpath ltree,
    CONSTRAINT test_pkey PRIMARY KEY (id)
);
CREATE INDEX test_idx1 ON test USING gist (lpath gist_ltree_ops);

Separate single-table databases, VACUUM ANALYZEd.

TESTING MACHINE:

Windows 7, PostgreSQL 9.3.0, 4GB RAM
effective_cache_size = 2GB
work_mem = 512MB
shared_buffers = 1GB

THE PROBLEM:

My intuition says integer[] should not be much worse than ltree (rather the other way around), but I am not able to reach such results. I believe I am making some trivial mistake rather than working from a false hypothesis. My general question is: what more can I do to get better performance?

WHAT I DID:

1/ I checked the gist index for integer[] provided by the intarray extension. Query plans for the <@ and @> operators do not use it (reported as a bug/feature). That is why I am using gin.
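For completeness, what I tried in 1/ looked roughly like this (the index name is illustrative; gist__int_ops is intarray's exact operator class, gist__intbig_ops the lossy one intended for larger arrays):

    CREATE EXTENSION intarray;
    CREATE INDEX test_idx_gist ON test USING gist (apath gist__int_ops);

EXPLAIN still showed the <@ / @> queries ignoring this index, hence the gin index above.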
2/ Getting all descendants of a chosen ancestor - same query plans, ltree slightly wins:

A:

select * from test where apath @> (select apath from test where id=1)

Bitmap Heap Scan on test  (cost=159.04..33950.48 rows=12206 width=60) (actual time=80.912..224.182 rows=488281 loops=1)
  Recheck Cond: (apath @> $0)
  Buffers: shared hit=18280
  InitPlan 1 (returns $0)
    ->  Index Scan using test_pkey on test test_1  (cost=0.43..8.45 rows=1 width=56) (actual time=0.022..0.023 rows=1 loops=1)
          Index Cond: (id = 1)
          Buffers: shared hit=4
  ->  Bitmap Index Scan on test_idx1  (cost=0.00..147.54 rows=12206 width=0) (actual time=76.896..76.896 rows=488281 loops=1)
        Index Cond: (apath @> $0)
        Buffers: shared hit=369
Total runtime: 240.408 ms

B:

select * from test where lpath <@ (select lpath from test where id=1)

Bitmap Heap Scan on test  (cost=263.81..8674.72 rows=2445 width=83) (actual time=85.395..166.683 rows=488281 loops=1)
  Recheck Cond: (lpath <@ $0)
  Buffers: shared hit=22448
  InitPlan 1 (returns $0)
    ->  Index Scan using test_pkey on test test_1  (cost=0.43..8.45 rows=1 width=79) (actual time=0.024..0.025 rows=1 loops=1)
          Index Cond: (id = 1)
          Buffers: shared hit=4
  ->  Bitmap Index Scan on test_idx1  (cost=0.00..254.75 rows=2445 width=0) (actual time=83.029..83.029 rows=488281 loops=1)
        Index Cond: (lpath <@ $0)
        Buffers: shared hit=12269
Total runtime: 182.239 ms

3/ Getting chosen nodes (eo) that have chosen ancestors (ea) - integer[] performs very poorly; its query plan uses an additional Bitmap Heap Scan, though indexes are used in both cases.
A:

select * from test eo
where id in (select generate_series(3, 3000000, 5000))
  and exists (
    select 1 from test ea
    where ea.id in (select generate_series(1000, 3000, 3))
      and ea.apath <@ eo.apath
  )

Nested Loop Semi Join  (cost=140.10..1302851104.53 rows=6103 width=60) (actual time=1768.862..210525.597 rows=104 loops=1)
  Buffers: shared hit=8420909
  ->  Nested Loop  (cost=17.94..1652.31 rows=1220554 width=60) (actual time=0.382..17.255 rows=489 loops=1)
        Buffers: shared hit=2292
        ->  HashAggregate  (cost=17.51..19.51 rows=200 width=4) (actual time=0.352..1.486 rows=600 loops=1)
              ->  Result  (cost=0.00..5.01 rows=1000 width=0) (actual time=0.009..0.100 rows=600 loops=1)
        ->  Index Scan using test_pkey on test eo  (cost=0.43..8.15 rows=1 width=60) (actual time=0.017..0.021 rows=1 loops=600)
              Index Cond: (id = (generate_series(3, 3000000, 5000)))
              Buffers: shared hit=2292
  ->  Hash Semi Join  (cost=122.16..1133.92 rows=6103 width=56) (actual time=430.482..430.482 rows=0 loops=489)
        Hash Cond: (ea.id = (generate_series(1000, 3000, 3)))
        Buffers: shared hit=8418617
        ->  Bitmap Heap Scan on test ea  (cost=94.65..251.23 rows=12206 width=60) (actual time=271.034..430.278 rows=8 loops=489)
              Recheck Cond: (apath <@ eo.apath)
              Rows Removed by Index Recheck: 444335
              Buffers: shared hit=8418617
              ->  Bitmap Index Scan on test_idx1  (cost=0.00..91.60 rows=12206 width=0) (actual time=155.355..155.355 rows=488281 loops=489)
                    Index Cond: (apath <@ eo.apath)
                    Buffers: shared hit=237668
        ->  Hash  (cost=15.01..15.01 rows=1000 width=4) (actual time=0.305..0.305 rows=667 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 24kB
              ->  Result  (cost=0.00..5.01 rows=1000 width=0) (actual time=0.003..0.116 rows=667 loops=1)
Total runtime: 210526.004 ms

B:

select * from test eo
where id in (select generate_series(3, 3000000, 5000))
  and exists (
    select 1 from test ea
    where ea.id in (select generate_series(1000, 3000, 3))
      and ea.lpath @> eo.lpath
  )

Nested Loop Semi Join  (cost=45.86..276756955.40 rows=1223 width=83) (actual time=2.985..226.161 rows=104 loops=1)
  Buffers: shared hit=27675
  ->  Nested Loop  (cost=17.94..1687.51 rows=1222486 width=83) (actual time=0.660..5.987 rows=489 loops=1)
        Buffers: shared hit=2297
        ->  HashAggregate  (cost=17.51..19.51 rows=200 width=4) (actual time=0.632..1.008 rows=600 loops=1)
              ->  Result  (cost=0.00..5.01 rows=1000 width=0) (actual time=0.007..0.073 rows=600 loops=1)
        ->  Index Scan using test_pkey on test eo  (cost=0.43..8.33 rows=1 width=83) (actual time=0.007..0.007 rows=1 loops=600)
              Index Cond: (id = (generate_series(3, 3000000, 5000)))
              Buffers: shared hit=2297
  ->  Hash Semi Join  (cost=27.92..242.30 rows=1223 width=79) (actual time=0.449..0.449 rows=0 loops=489)
        Hash Cond: (ea.id = (generate_series(1000, 3000, 3)))
        Buffers: shared hit=25378
        ->  Index Scan using test_idx1 on test ea  (cost=0.41..43.43 rows=2445 width=83) (actual time=0.060..0.445 rows=8 loops=489)
              Index Cond: (lpath @> eo.lpath)
              Buffers: shared hit=25378
        ->  Hash  (cost=15.01..15.01 rows=1000 width=4) (actual time=0.178..0.178 rows=667 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 24kB
              ->  Result  (cost=0.00..5.01 rows=1000 width=0) (actual time=0.003..0.071 rows=667 loops=1)
Total runtime: 226.308 ms

3.a/ If I turn off the index for A, the query is much faster:

Nested Loop Semi Join  (cost=35.88..20535547247.01 rows=6103 width=60) (actual time=17.257..1112.276 rows=104 loops=1)
  Join Filter: (ea.apath <@ eo.apath)
  Rows Removed by Join Filter: 287529
  Buffers: shared hit=1155469
  ->  Nested Loop  (cost=17.94..1652.31 rows=1220554 width=60) (actual time=0.334..5.307 rows=489 loops=1)
        Buffers: shared hit=2292
        ->  HashAggregate  (cost=17.51..19.51 rows=200 width=4) (actual time=0.297..0.598 rows=600 loops=1)
              ->  Result  (cost=0.00..5.01 rows=1000 width=0) (actual time=0.008..0.088 rows=600 loops=1)
        ->  Index Scan using test_pkey on test eo  (cost=0.43..8.15 rows=1 width=60) (actual time=0.007..0.007 rows=1 loops=600)
              Index Cond: (id = (generate_series(3, 3000000, 5000)))
              Buffers: shared hit=2292
  ->  Nested Loop  (cost=17.94..1652.31 rows=1220554 width=56) (actual time=0.004..1.946 rows=588 loops=489)
        Buffers: shared hit=1153177
        ->  HashAggregate  (cost=17.51..19.51 rows=200 width=4) (actual time=0.001..0.089 rows=588 loops=489)
              ->  Result  (cost=0.00..5.01 rows=1000 width=0) (actual time=0.005..0.103 rows=667 loops=1)
        ->  Index Scan using test_pkey on test ea  (cost=0.43..8.15 rows=1 width=60) (actual time=0.002..0.003 rows=1 loops=287633)
              Index Cond: (id = (generate_series(1000, 3000, 3)))
              Buffers: shared hit=1153177
Total runtime: 1112.411 ms

3.b/ With the index on, lowering effective_cache_size to 256MB makes A skip the index as well; the same query plan as in 3.a is used. The index is still used for both versions of query 2 and for the ltree version of query 3.

QUESTIONS:

1/ Is my hypothesis about the similar performance of ltree and integer[] correct?

2/ If so, what should I do to achieve it?

3/ Is there a way to improve the performance of the <@ and @> operators? Since the column holds a tree path, I only need to check whether path_a 'starts with' path_b to get ancestors/descendants, so something more efficient than 'contains' could be used. Is there any way?

4/ Do I understand correctly that the index on integer[] is much more memory-consuming, and that this is why the query plans / execution times differ?

Thanks for any help,
Jan
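P.S. Re question 3: since <@ / @> on integer[] test element containment rather than a path prefix, one idea I have not benchmarked yet is to express 'starts with' as a range scan on a plain btree index, exploiting PostgreSQL's element-wise ordering of arrays (a prefix sorts immediately before its extensions):

    CREATE INDEX test_idx_btree ON test (apath);

    -- untested sketch: all descendants of the node with path {1,5,3} sort
    -- strictly between that path and the path with its last element incremented
    select * from test
    where apath > ARRAY[1,5,3]
      and apath < ARRAY[1,5,4];

(And re question 4: pg_relation_size('test_idx1') on each database should show the actual index footprints.)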