Re: Scaling SELECT:s with the number of disks on a stripe

Peter Schuller <peter.schuller@xxxxxxxxxxxx> · Wed, 4 Apr 2007 08:01:16 +0200

Hello,

> The next question then is whether anything in your postgres configuration
> is preventing it getting useful performance from the OS. What settings
> have you changed in postgresql.conf?

The only options not commented out are the following (it's not even
tweaked for buffer sizes and such, since in this case I am not
interested in things like sort performance and cache locality other
than as an afterthought):

hba_file = '/etc/postgresql/8.1/main/pg_hba.conf'
ident_file = '/etc/postgresql/8.1/main/pg_ident.conf'
external_pid_file = '/var/run/postgresql/8.1-main.pid'
listen_addresses = '*'
port = 5432
max_connections = 100
unix_socket_directory = '/var/run/postgresql'
ssl = true
shared_buffers = 1000
log_line_prefix = '%t '
stats_command_string = on
stats_row_level = on
autovacuum = on
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'

> Are you using any unusual settings within the OS itself?

No. It's a pretty standard kernel. The only local tweaking done is
enabling/disabling various things; there are no special patches used
or attempts to create a minimalistic kernel or anything like that.

> You're forgetting the LIMIT clause. For the straight index scan, the
> query aborts when the LIMIT is reached having scanned only the specified
> number of index rows (plus any index entries that turned out to be dead
> in the heap). For the bitmap scan case, the limit can be applied only after
> the heap scan is under way, therefore the index scan to build the bitmap
> will need to scan ~50k rows, not the 10k specified in the limit, so the
> amount of time spent scanning the index is 50 times larger than in the
> straight index scan case.

Ok - makes sense that it has to scan the entire subset of the index
for the value in question. I will have to tweak the CPU/disk costs
settings (which I have, on purpose, not yet done).

> However, I do suspect you have a problem here somewhere, because in my
> tests the time taken to do the bitmap index scan on 50k rows, with the
> index in cache, is on the order of 30ms (where the data is cached in
> shared_buffers) to 60ms (where the data is cached by the OS). That's on
> a 2.8GHz xeon.

This is on a machine with 2.33GHz xeons and I wasn't trying to
exaggerate. I timed it and it is CPU bound (in userspace; next to no
system CPU usage at all) for about 15 seconds for the case of
selecting with a limit of 10000.

Given that there is no disk activity I can't imagine any buffer sizes
or such affecting this other than userspace vs. kernelspace CPU
concerns (since obviously the data being worked on is in RAM). Or am I
missing something?

It is worth noting that the SELECT of fewer entries is entirely disk
bound; there is almost no CPU usage whatsoever. Even taking the
cumulative CPU usage into account (gut feeling calculation, nothing
scientific) and multiplying by 50 you are nowhere near 15 seconds of
CPU boundness. So it is indeed strange.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@xxxxxxxxxxxx>'
Key retrieval: Send an E-Mail to getpgpkey@xxxxxxxxx
E-Mail: peter.schuller@xxxxxxxxxxxx Web: http://www.scode.org

Attachment:
pgpzBydXeyQ9z.pgp

Description: PGP signature