Re: Very poor read performance, query independent

Scott Marlowe <scott.marlowe@xxxxxxxxx> · Tue, 18 Jul 2017 19:08:14 -0600

On Tue, Jul 18, 2017 at 3:20 AM, Charles Nadeau
<charles.nadeau@xxxxxxxxx> wrote:
> Claudio,
>
> Find attached the iostat measured while redoing the query above
> (iostat1.txt). sda holds my temp directory (noop i/o scheduler), sdb the
> swap partition (cfq i/o scheduler) only and sdc (5 disks RAID0, noop i/o
> scheduler) holds the data. I didn't pay attention to the load caused by 12
> parallel scans as I thought the RAID card would be smart enough to
> re-arrange the read requests optimally regardless of the load. At one moment
> during the query, there is a write storm to the swap drive (a bit like this
> case:
> https://www.postgresql.org/message-id/AANLkTi%3Diw4fC2RgTxhw0aGpyXANhOT%3DXBnjLU1_v6PdA%40mail.gmail.com).

My experience from that case (and  few more) has led me to believe
that Linux database servers with plenty of memory should have their
swaps turned off. The Linux kernel works hard to swap out little used
memory to make more space for caching active data.

Problem is that whatever decides to swap stuff out gets stupid when
presented with 512GB RAM and starts swapping out things like sys v
shared_buffers etc.

Here's the thing, either your memory is big enough to buffer your
whole data set, so nothing should get swapped out to make room for
caching.

OR your dataset is much bigger than memory. In which case, making more
room gets very little if it comes at the cost of waiting for stuff you
need to get read back in.

Linux servers should also have zone reclaim turned off, and THP disabled.

Try running "sudo swapoff -a" and see if it gets rid of your swap storms.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance