Re: Sort performance on large tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I'd set up a trigger to maintain summary tables perhaps...

Chris


Charlie Savage wrote:
Thanks everyone for the feedback.

I tried increasing work_mem:

set work_mem to 300000;

select tlid, min(ogc_fid)
from completechain
group by tld;

The results are:

"GroupAggregate  (cost=9041602.80..10003036.88 rows=48071704 width=8)
(actual time=4371749.523..5106162.256 rows=47599910 loops=1)"
"  ->  Sort  (cost=9041602.80..9161782.06 rows=48071704 width=8) (actual
time=4371690.894..4758660.433 rows=48199165 loops=1)"
"        Sort Key: tlid"
"        ->  Seq Scan on completechain  (cost=0.00..2228584.04
rows=48071704 width=8) (actual time=49.518..805234.970 rows=48199165
loops=1)"
"Total runtime: 5279988.127 ms"

Thus the time decreased from 8486 seconds to 5279 seconds - which is a nice improvement. However, that still leaves postgresql about 9 times slower.

I tried increasing work_mem up to 500000, but at that point the machine started using its swap partition and performance degraded back to the original values.

Charlie


Richard Huxton wrote:
 > Charlie Savage wrote:
 >> Hi everyone,
 >>
 >> I have a question about the performance of sort.
 >
 >> Note it takes over 10 times longer to do the sort than the full
 >> sequential scan.
 >>
 >> Should I expect results like this?  I realize that the computer is
 >> quite low-end and is very IO bound for this query, but I'm still
 >> surprised that the sort operation takes so long.
 >
 > The sort will be spilling to disk, which will grind your I/O to a halt.
 >
 >> work_mem =  16384                        # in Kb
 >
 > Try upping this. You should be able to issue "set work_mem = 100000"
 > before running your query IIRC. That should let PG do its sorting in
 > larger chunks.
 >
 > Also, if your most common access pattern is ordered via tlid look into
 > clustering the table on that.



Richard Huxton wrote:

Charlie Savage wrote:

Hi everyone,

I have a question about the performance of sort.


Note it takes over 10 times longer to do the sort than the full sequential scan.

Should I expect results like this? I realize that the computer is quite low-end and is very IO bound for this query, but I'm still surprised that the sort operation takes so long.


The sort will be spilling to disk, which will grind your I/O to a halt.

work_mem =  16384                        # in Kb


Try upping this. You should be able to issue "set work_mem = 100000" before running your query IIRC. That should let PG do its sorting in larger chunks.

Also, if your most common access pattern is ordered via tlid look into clustering the table on that.


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux