On Thu, 18 Sep 2008, Greg Smith wrote:
It's just that our system is doing a lot of bitmap index scans at the
moment, and it'd help to be able to spread them across the 16 discs in the
RAID array. It's the bottleneck in our system at the moment.
If you have some specific bitmap index scan test case suggestions you can
pass along (either publicly or in private to me, I can probably help
anonymize them), that's one of the things that has been holding this up.
Okay, here's a description of what we're doing. We are importing data from
a large file (so we don't have a choice on the order that the data comes
in). For each entry in the source, we have to look up the corresponding
row in the database, and issue the correct "UPDATE" command according to
what's in the database and what's in the data source. Long story - it
isn't feasible to change the overall process.
In order to improve the performance, I made the system look ahead in the
source, in groups of a thousand entries, so instead of running:
SELECT * FROM table WHERE field = 'something';
a thousand times, we now run:
SELECT * FROM table WHERE field IN ('something', 'something else'...);
with a thousand things in the IN. Very simple query. It does run faster
than the individual queries, but it still takes quite a while. Here is an
example query:
SELECT a1_.id AS a1_id, a1_.primaryIdentifier AS a2_ FROM Gene AS a1_
WHERE a1_.primaryIdentifier IN ('SPAC11D3.15', 'SPAC11D3.16c',
'SPAC11D3.17', 'SPAC11D3.18c', 'SPAC15F9.01c', 'SPAC15F9.02', 'SPAC16.01',
'SPAC18G6.01c', 'SPAC18G6.02c', 'SPAC18G6.04c', 'SPAC18G6.05c',
'SPAC18G6.06', 'SPAC18G6.07c', 'SPAC18G6.09c', 'SPAC18G6.10',
'SPAC18G6.11c', 'SPAC18G6.12c', 'SPAC18G6.13', 'SPAC18G6.14c',
'SPAC18G6.15', 'SPAC1B9.01c', 'SPAC1D4.02c', 'SPAC1D4.03c', 'SPAC1D4.04',
'SPAC1D4.05c', 'SPAC1D4.07c', 'SPAC1D4.08', 'SPAC1D4.09c', 'SPAC1D4.10',
'SPAC1D4.11c', 'SPAC1F3.11', 'SPAC23A1.10', 'SPAC23E2.01', 'SPAC23E2.02',
'SPAC23E2.03c', 'SPAC26A3.02', 'SPAC26A3.03c', 'SPAC26A3.05',
'SPAC26A3.06', 'SPAC26A3.07c', 'SPAC26A3.08', 'SPAC26A3.09c',
'SPAC26A3.10', 'SPAC26A3.11', 'SPAC26A3.14c', 'SPAC26A3.15c',
'SPAC26A3.16', 'SPAC27F1.01c', 'SPAC27F1.03c', 'SPAC27F1.04c',
'SPAC27F1.05c', 'SPAC27F1.06c', 'SPAC3H8.02', 'SPAC3H8.03', 'SPAC3H8.04',
'SPAC3H8.05c', 'SPAC3H8.06', 'SPAC3H8.07c', 'SPAC3H8.08c', 'SPAC3H8.09c',
'SPAC3H8.10', 'SPAC3H8.11', 'SPAC8E11.11', 'SPBC106.15', 'SPBC17G9.10',
'SPBC24E9.15c', 'WBGene00000969', 'WBGene00003035', 'WBGene00004095',
'WBGene00016011', 'WBGene00018672', 'WBGene00018674', 'WBGene00018675',
'WBGene00018676', 'WBGene00018959', 'WBGene00018960', 'WBGene00018961',
'WBGene00023407') ORDER BY a1_.id LIMIT 2000;
And the corresponding EXPLAIN ANALYSE:
Limit (cost=331.28..331.47 rows=77 width=17) (actual time=121.973..122.501 rows=78 loops=1)
-> Sort (cost=331.28..331.47 rows=77 width=17) (actual time=121.968..122.152 rows=78 loops=1)
Sort Key: id
Sort Method: quicksort Memory: 29kB
-> Bitmap Heap Scan on gene a1_ (cost=174.24..328.87 rows=77 width=17) (actual time=114.311..121.705 rows=78 loops=1)
Recheck Cond: (primaryidentifier = ANY ('{SPAC11D3.15...
-> Bitmap Index Scan on gene__key_primaryidentifier (cost=0.00..174.22 rows=77 width=0) (actual time=44.434..44.434 rows=150 loops=1)
Index Cond: (primaryidentifier = ANY ('{SPAC11D3.15,SPAC11D3.16c...
Total runtime: 122.733 ms
(9 rows)
Although it's probably in the cache, as it took 1073 ms the first time.
The table has half a million rows, but tables all over the database are
being accessed, so the cache is shared between several hundred million
rows.
Postgres executes this query in two stages. First it does a trawl of the
index (on field), and builds an bitmap. Then it fetches the pages
according to the bitmap. I can see the second stage being quite easy to
adapt for fadvise, but the first stage would be a little more tricky. Both
stages are equally important, as they take a comparable amount of time.
We are running this database on a 16-spindle RAID array, so the benefits
to our process of fully utilising them would be quite large. I'm
considering if I can parallelise things a little though.
Alternately, if you'd like to join in on testing this all out more help would
certainly be welcome.
How would you like me to help?
Matthew
--
What goes up must come down. Ask any system administrator.