Hi all,
I am still encountering this issue.
I am doing further troubleshooting.
Here is what I found:
When I run dtrace -s /usr/demo/dtrace/whoio.d
I find that one process is doing the majority of the i/o, but that
process is not listed in pg_stat_activity.
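A rough sketch of how that cross-check could be scripted, in case anyone wants to reproduce it. The PID values are invented and the helper name unlisted_pids is only illustrative; the idea is to subtract the backend PIDs reported by pg_stat_activity (the procpid column in 8.1) from the PIDs that whoio.d shows doing i/o.

```python
def unlisted_pids(io_pids, backend_pids):
    """Return PIDs seen doing i/o that are not client backends, sorted."""
    return sorted(set(io_pids) - set(backend_pids))


# Hypothetical sample data: PIDs copied by hand from the whoio.d output and
# from "select procpid from pg_stat_activity" (values invented here).
io_pids = [1234, 2201, 2207]
backend_pids = [2201, 2207, 2310]

print(unlisted_pids(io_pids, backend_pids))
```

Any PID left over is a candidate for a postgres internal process rather than a client connection.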
I am also seeing more slow queries of this type:
EXECUTE <unnamed> [PREPARE: ...
I have also seen an article recommending adding these entries to
/etc/system:
segmapsize=2684354560
set ufs:freebehind=0
I haven't tried this yet; I am wondering whether it will help.
Also, here is the output of iostat -xcznmP 1 at approximately the time of
the i/o spike:
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
4.0 213.0 32.0 2089.9 0.0 17.0 0.0 78.5 0 61 c1t0d0s6 (/usr)
cpu
us sy wt id
54 6 0 40
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 0.9 0.0 0.0 0 90 c1t0d0s1 (/var)
2.0 335.0 16.0 3341.6 0.2 73.3 0.6 217.4 4 100 c1t0d0s6 (/usr)
cpu
us sy wt id
30 4 0 66
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 1.0 0.0 4.0 0.0 0.1 0.0 102.0 0 10 c1t0d0s1 (/var)
1.0 267.0 8.0 2729.1 0.0 117.8 0.0 439.5 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
28 8 0 64
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.0 270.0 8.0 2589.0 0.0 62.0 0.0 228.7 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
26 2 0 72
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
2.0 269.0 16.0 2971.5 0.0 66.6 0.0 245.7 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
8 7 0 86
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.0 268.0 8.0 2343.5 0.0 110.3 0.0 410.2 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
4 4 0 92
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 260.0 0.0 2494.5 0.0 63.5 0.0 244.2 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
24 3 0 74
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.0 286.0 8.0 2519.1 35.4 196.5 123.3 684.7 49 100 c1t0d0s6 (/usr)
cpu
us sy wt id
65 4 0 30
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
2.0 316.0 16.0 2913.8 0.0 117.2 0.0 368.7 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
84 7 0 9
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
5.0 263.0 40.0 2406.1 0.0 55.8 0.0 208.1 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
77 4 0 20
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
4.0 286.0 32.0 2750.6 0.0 75.0 0.0 258.5 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
21 3 0 77
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
2.0 273.0 16.0 2516.4 0.0 90.8 0.0 330.0 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
15 6 0 78
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
2.0 280.0 16.0 2711.6 0.0 65.6 0.0 232.6 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
6 3 0 92
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.0 308.0 8.0 2661.5 61.0 220.2 197.4 712.7 67 100 c1t0d0s6 (/usr)
cpu
us sy wt id
7 4 0 90
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.0 268.0 8.0 2839.9 0.0 97.1 0.0 360.9 0 100 c1t0d0s6 (/usr)
cpu
us sy wt id
11 10 0 80
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 309.0 0.0 3333.5 175.2 208.9 566.9 676.2 81 99 c1t0d0s6 (/usr)
cpu
us sy wt id
0 0 0 100
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 330.0 0.0 2704.0 145.6 256.0 441.1 775.7 100 100 c1t0d0s6 (/usr)
cpu
us sy wt id
4 2 0 94
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 311.0 0.0 2543.9 151.0 256.0 485.6 823.2 100 100 c1t0d0s6 (/usr)
cpu
us sy wt id
2 0 0 98
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 319.0 0.0 2576.0 147.4 256.0 462.0 802.5 100 100 c1t0d0s6 (/usr)
cpu
us sy wt id
0 1 0 98
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 2 13 c1t0d0s1 (/var)
0.0 366.0 0.0 3088.0 124.4 255.8 339.9 698.8 100 100 c1t0d0s6 (/usr)
cpu
us sy wt id
6 5 0 90
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 2.0 0.0 16.0 0.0 1.1 0.0 533.2 0 54 c1t0d0s1 (/var)
1.0 282.0 8.0 2849.0 1.5 129.2 5.2 456.5 10 100 c1t0d0s6 (/usr)
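To make the numbers above easier to compare, here is a small sketch that tallies them per device. It assumes the column order shown in the header row above (r/s through %b, then the device name); the function names parse_iostat and summarize are only illustrative, and the sample text is just the first two intervals from the output above.

```python
COLUMNS = ["r/s", "w/s", "kr/s", "kw/s", "wait", "actv",
           "wsvc_t", "asvc_t", "%w", "%b"]


def parse_iostat(text):
    """Yield (device, stats) for each data row of iostat -xn style output."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # banner, cpu block, or a wrapped "(/usr)" fragment
        try:
            values = [float(f) for f in fields[:10]]
        except ValueError:
            continue  # the "r/s w/s ..." column header row
        yield fields[10], dict(zip(COLUMNS, values))


def summarize(text):
    """Per device: sample count, total w/s, peak actv, peak asvc_t."""
    summary = {}
    for device, row in parse_iostat(text):
        s = summary.setdefault(device, {"samples": 0, "w/s_total": 0.0,
                                        "max_actv": 0.0, "max_asvc_t": 0.0})
        s["samples"] += 1
        s["w/s_total"] += row["w/s"]
        s["max_actv"] = max(s["max_actv"], row["actv"])
        s["max_asvc_t"] = max(s["max_asvc_t"], row["asvc_t"])
    return summary


# First two intervals from the output above.
sample = """\
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
4.0 213.0 32.0 2089.9 0.0 17.0 0.0 78.5 0 61 c1t0d0s6 (/usr)
cpu
us sy wt id
54 6 0 40
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 0.9 0.0 0.0 0 90 c1t0d0s1 (/var)
2.0 335.0 16.0 3341.6 0.2 73.3 0.6 217.4 4 100 c1t0d0s6 (/usr)
"""

for device, s in sorted(summarize(sample).items()):
    mean_writes = s["w/s_total"] / s["samples"]
    print(device, s["samples"], mean_writes, s["max_actv"], s["max_asvc_t"])
```

Feeding it the full capture instead of the sample would give the peak queue depth (actv) and peak service time (asvc_t) for each slice over the whole spike.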
Thank you in advance for your help!
Jun
On 8/30/06, *Junaili Lie* <junaili@xxxxxxxxx> wrote:
I have tried this to no avail.
I have also tried changing the bgwriter_delay parameter to 10. The
spike in i/o still occurs, although not on a consistent basis, and it
only lasts for a few seconds.
On 8/30/06, *Jignesh K. Shah* <J.K.Shah@xxxxxxx> wrote:
The bgwriter parameters changed in 8.1
Try
bgwriter_lru_maxpages=0
bgwriter_lru_percent=0
to turn off bgwriter and see if there is any change.
-Jignesh
Junaili Lie wrote:
> Hi Jignesh,
> Thank you for your reply.
> I have the setting just like what you described:
>
> wal_sync_method = fsync
> wal_buffers = 128
> checkpoint_segments = 128
> bgwriter_all_percent = 0
> bgwriter_maxpages = 0
>
>
> I ran the dtrace script and found the following:
> During the i/o busy time, there is a postgres process that has a very
> high BYTES count. Outside the i/o busy time, this same process
> doesn't do a lot of i/o. I checked pg_stat_activity but couldn't
> find this process. Running ps revealed that this process was started
> at the same time postgres started, which leads me to believe that it
> may be the background writer or some other internal process.
> This process is not autovacuum, because it doesn't disappear when I
> turn autovacuum off.
> Except for the ones mentioned above, I didn't modify the other
> background writer settings:
> MONSOON=# show bgwriter_delay ;
> bgwriter_delay
> ----------------
> 200
> (1 row)
>
> MONSOON=# show bgwriter_lru_maxpages ;
> bgwriter_lru_maxpages
> -----------------------
> 5
> (1 row)
>
> MONSOON=# show bgwriter_lru_percent ;
> bgwriter_lru_percent
> ----------------------
> 1
> (1 row)
>
> This i/o spike only happens at minute 1 and minute 6 (i.e. 10.51,
> 10.56). If I do select * from pg_stat_activity during this time, I
> see a lot of write queries waiting to be processed. After a few
> seconds, everything seems to be gone. All writes that do not happen
> at the time of this i/o jump are processed very fast, and thus do not
> show up in pg_stat_activity.
>
> Thanks in advance for the reply,
> Best,
>
> J
>
> On 8/29/06, *Jignesh K. Shah* <J.K.Shah@xxxxxxx> wrote:
>
> Also to answer your real question:
>
> DTrace On Solaris 10:
>
> # dtrace -s /usr/demo/dtrace/whoio.d
>
> It will tell you the pids doing the i/o activity and on which devices.
> There are more scripts in that directory, such as iosnoop.d and
> iotime.d, which give other details such as the file accessed and
> the time the i/o took.
>
> Hope this helps.
>
> Regards,
> Jignesh
>
>
> Junaili Lie wrote:
> > Hi everyone,
> > We have PostgreSQL 8.1 installed on Solaris 10. It is running fine.
> > However, for the past couple of days, the i/o reports have
> > indicated that the i/o is busy most of the time. Before this, we
> > only saw i/o being busy occasionally (very rarely). So far, there
> > have been no performance complaints from customers, and the slow
> > query reports don't indicate anything out of the ordinary.
> > There have been no code changes in the application layer and no
> > database configuration changes.
> > I am wondering if there's a tool on Solaris that can tell which
> > process is doing most of the i/o activity?
> > Thank you in advance.
> >
> > J
> >
>
>