Re: [HACKERS] 8.3beta1 testing on Solaris

Bruce Momjian <bruce@xxxxxxxxxx> · Thu, 15 Nov 2007 15:49:27 -0500 (EST)

This has been saved for the 8.4 release:

	http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Jignesh K. Shah wrote:
> 
> I changed CLOG Buffers to 16.
> 
> Running the test again:
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU     ID                    FUNCTION:NAME
>   0   1027                       :tick-5sec
> 
>   /export/home0/igen/pgdata/pg_clog/0024               
> -2753028219296                1
>   /export/home0/igen/pgdata/pg_clog/0025               
> -2753028211104                1
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU     ID                    FUNCTION:NAME
>   1   1027                       :tick-5sec
> 
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU     ID                    FUNCTION:NAME
>   1   1027                       :tick-5sec
> 
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU     ID                    FUNCTION:NAME
>   0   1027                       :tick-5sec
> 
>   /export/home0/igen/pgdata/pg_clog/0025               
> -2753028194720                1
> 
> 
> So Tom seems to be correct that it is a case of CLOG buffer thrashing.
> But since I saw the same problem with two different workloads, I think
> the chance of people hitting this problem is pretty high.
> 
> Also, I am a bit surprised that CLogControlLock did not show up as being
> hot. Maybe that is because not many writes are going on, or maybe because
> I did not trace all 500 users to see their hot lock status.
> 
> 
> Dmitri has another workload to test; I might try that out later on to
> see whether it causes a similar impact.
> 
> Of course, I haven't seen my throughput go up yet, since I am already CPU
> bound. But this is still good, since the number of IOPS to the disk is
> reduced (and hence the number of system calls).
> 
> 
> If I take this as my baseline number, can I then proceed to hunt other
> bottlenecks?
> 
> 
> What's the view of the community?
> 
> Should I hunt down CPU utilization or lock waits next?
> 
> Your votes are crucial on where I put my focus.
> 
> Another thing Josh B told me to check out was the wal_writer_delay setting:
> 
> I have tried two settings with almost equal performance (with the CLOG 16
> setting): one with 100ms and the other at the default of 200ms. Based on
> the runs, 100ms seemed slightly better than the default.
> (Plus the risk of losing data is reduced from 600ms to 300ms.)
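
The 600ms and 300ms figures quoted above are consistent with the documented rule that, with asynchronous commit, the window in which committed transactions can be lost is at most three times wal_writer_delay. A small sketch of that arithmetic, assuming asynchronous commit is what is being measured here (the function name is made up for illustration):

```python
# Assumption: asynchronous commit is in use, where the documented worst-case
# loss window is three times wal_writer_delay.
RISK_MULTIPLIER = 3


def max_loss_window_ms(wal_writer_delay_ms):
    """Return the worst-case data-loss window in milliseconds."""
    return RISK_MULTIPLIER * wal_writer_delay_ms


for delay_ms in (200, 100):
    print(f"wal_writer_delay={delay_ms}ms -> up to {max_loss_window_ms(delay_ms)}ms at risk")
```
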
> 
> Thanks.
> 
> Regards,
> Jignesh
> 
> 
> 
> 
> Tom Lane wrote:
> > "Jignesh K. Shah" <J.K.Shah@xxxxxxx> writes:
> >   
> >> So the ratio of reads vs writes to clog files is pretty huge..
> >>     
> >
> > It looks to me that the issue is simply one of not having quite enough
> > CLOG buffers.  Your first run shows 8 different pages being fetched and
> > the second shows 10.  Bearing in mind that we "pin" the latest CLOG page
> > into buffers, there are only NUM_CLOG_BUFFERS-1 buffers available for
> > older pages, so what we've got here is thrashing for the available
> > slots.
> >
> > Try increasing NUM_CLOG_BUFFERS to 16 and see how it affects this test.
> >
> > 			regards, tom lane
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: Have you checked our extensive FAQ?
> >
> >                http://www.postgresql.org/docs/faq
> >   
> 

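The thrashing Tom describes can be illustrated with a toy model. This is only a sketch, not the actual SLRU code: it assumes a plain LRU replacement policy, one slot reserved for the pinned latest page, and a hypothetical workload that touches 10 distinct older pages round-robin (10 being the page count Tom read from the second run).

```python
from collections import OrderedDict


def count_reads(num_buffers, pages, pinned_slots=1):
    """Count simulated disk reads for a page access sequence under plain LRU."""
    capacity = num_buffers - pinned_slots  # slots left for older pages
    cache = OrderedDict()
    reads = 0
    for page in pages:
        if page in cache:
            cache.move_to_end(page)  # hit: mark as most recently used
            continue
        reads += 1  # miss: the page has to be read from disk
        if len(cache) >= capacity:
            cache.popitem(last=False)  # evict the least recently used page
        cache[page] = True
    return reads


# Hypothetical workload: 10 distinct older pages, touched round-robin 100 times.
workload = [page for _ in range(100) for page in range(10)]

reads_8 = count_reads(8, workload)
reads_16 = count_reads(16, workload)
print(reads_8, reads_16)
```

Under these assumptions, 8 buffers leave 7 usable slots for 10 pages, so every one of the 1000 accesses misses, while 16 buffers need only the 10 initial reads. Real access patterns are not strictly round-robin, so the real gap will be smaller, but the direction matches the drop in pg_clog reads seen in the dtrace output.
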
-- 
  Bruce Momjian  <bruce@xxxxxxxxxx>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
