Re: Re: Increasing work_mem and shared_buffers on Postgres 9.2 significantly slows down queries

Petr Praus <petr@xxxxxxxxx> · Fri, 2 Nov 2012 11:12:22 -0500

On 1 November 2012 18:25, Gunnar "Nick" Bluth <gunnar.bluth@xxxxxxxxxxx> wrote:

    Am 01.11.2012 21:40, schrieb Marcos
      Ortiz:

      Regards, Petr.

      Tuning PostgreSQL is not just change the postgresql.conf, it
      includes more things like:

      - the filesystem that you are using

      - the kernel version that you using (particularly in Linux
      systems)

      - the tuning to kernel variables 

      - the type of discs that you are using (SSDs are very fast, like
      you saw in your iMac system)

      On 10/30/2012 02:44 PM, Petr Praus
        wrote:

      I just found one particularly interesting fact: when
        I perform the same test on my mid-2010 iMac (OSX 10.7.5) also
        with Postgres 9.2.1 and 16GB RAM, I don't experience the slow
        down.
        Specifically:
        set work_mem='1MB';
        select ...; // running time is ~1800 ms
        set work_mem='96MB';
        select ...' // running time is ~1500 ms

        When I do exactly the same query (the one from my previous
          post) with exactly the same data on the server:
        I get 2100 ms with work_mem=1MB and 3200 ms with 96 MB.

    Just some thoughts (interested in this, once seen a Sybase ASE come
    close to a halt when we threw a huge lot of SHM at it...).

    8 cores, so probably on 2 sockets? What CPU generation?

The processors are two quad core Intel x7350 Xeon at 2.93Ghz. It's somewhat older (released late 2007) but it's not absolute speed I'm after - it's the difference in speed when increasing work_mem.

    Both explain outputs show an amount of "read" buffers. Did you warm
    the caches before testing?

I did warm the caches before testing. 

    Maybe you're hitting a NUMA issue there? If those reads come from
    the OS' cache, the scheduler might decide to move your process to a
    different core (that can access the cache better), then moves it
    back when you access the SHM segment more (the ~4GB get allocated at
    startup, so probably "close" to the CPU the postmaster ist running
    on). A migration to a different cacheline is very expensive.

    The temp reads/writes (i.e., the OS cache for the temp files) would
    probably be allocated close to the CPU requesting the temp file.

    Just groping about in the dark though... but the iMac is obviously
    not affected by this, with one socket/memory channel/cache line.

I made a test with Ubuntu 12.04 VM machine (vmware workstation 4.1.3 on the same iMac) with 4GB memory and shared_buffers=1GB. To my slight surprise, the query is faster on Ubuntu VM machine then on the OSX (~1050ms vs. ~1500ms with work_mem=1MB). This might be caused by effective_io_concurrency which is enabled on Ubuntu but can't be enabled on OSX because postgres does not support it there. The interesting thing is that increasing work_mem to 96MB on Ubuntu slows down the query to about ~1250ms from ~1050ms.

    Might be worth to

    - manually pin (with taskset) the session you test this in to a
    particular CPU (once on each socket) to see if the times change

I tested this and it does not seem to have any effect (assuming I used taskset correctly but I think so: taskset 02 psql to pin down to CPU #1 and taskset 01 psql to pin to CPU #0).

    - try reducing work_mem in the session you're testing in (so you
    have large SHM, but small work mem)

Did this and it indicates to me that shared_buffers setting actually does not have an effect on this behaviour as I previously thought it has. It really boils down to work_mem: when I set shared_buffers to something large (say 4GB) and just play with work_mem the problem persists.

    Cheers,

    -- 
Gunnar "Nick" Bluth
RHCE/SCLA

Mobil   +49 172 8853339
Email: gunnar.bluth@xxxxxxxxxxx
__________________________________________________________________________
In 1984 mainstream users were choosing VMS over UNIX.  Ten years later
they are choosing Windows over UNIX.  What part of that message aren't you
getting? - Tom Payne