While investigating erratic performance on one our our servers, I'm getting some very odd performance stats coming from vmstat. What initially appeared to be happening is the machine goes into a hard loop in some mod_perl webserver code. Now there still may be an issue with the code, but my code examinations show no possible way this could be happening, but what I'm writing to you about is kjournald. When this apparently looping state is reached, kjournald shows D state for several seconds. During that same time, block-out "bo" in vmstat consistently shows 0. kjournald is choking. Or something. Also, this appears to get a lot of processes looping on write somewhere in the webserver. Looping on the lockfile, maybe? The runqueue gets huge > 40 often , and apache spirals up to all 75 processes in use. These pauses seem to occur at pretty exact intervals. Something like every 240 seconds. With the disk in question mounted as ext2 I don't get the issue. At all. The disk is IDE, but fairly fast, and we're using ext3 on rh 7.2 With kjournald waiting on disk, so is everything else. Since its a somewhat busy webserver with 50% of its hits involving a log write, a mysql query and an update, this means that disk is under nearly constant write load. By busy I mean 20 to 30 hits per second. I'd like to find a way to tell kjournald to wait less time with smaller buffers so that the big pause while kjournald flushes and subsequent long runqueue and its associated context switch overhead does not occur. Configuration data follows [root@arsozah root]# cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Pentium(R) 4 CPU 1.60GHz Linux arsozah.userfriendly.org 2.4.18-10 #1 Wed Aug 7 11:39:21 EDT 2002 i686 unknown root@arsozah root]# hdparm -Tt /dev/hda /dev/hda: Timing buffer-cache reads: 128 MB in 0.49 seconds =261.22 MB/sec Timing buffered disk reads: 64 MB in 3.12 seconds = 20.51 MB/sec [root@arsozah root]# hdparm -iv /dev/hda /dev/hda: multcount = 16 (on) I/O support = 1 (32-bit) unmaskirq = 1 (on) using_dma = 1 (on) keepsettings = 0 (off) nowerr = 0 (off) readonly = 0 (off) readahead = 8 (on) geometry = 2495/255/63, sectors = 40088160, start = 0 Model=IBM-DPTA-372050, FwRev=P76OA30A, SerialNo=JMYJMFJ4664 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs } RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=34 BuffType=DualPortCache, BuffSize=1961kB, MaxMultSect=16, MultSect=16 CurCHS=16383/16/63, CurSects=-66060037, LBA=yes, LBAsects=40088160 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120} PIO modes: pio0 pio1 pio2 pio3 pio4 DMA modes: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 *udma4 AdvancedPM=no Drive Supports : ATA/ATAPI-4 T13 1153D revision 17 : ATA-1 ATA-2 ATA-3 ATA-4 DMA modes: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 4:15pm up 14 days, 23:58, 2 users, load average: 10.72, 7.41, 6.20 vmstat 5 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 20 0 1 23300 9464 62276 109708 0 0 0 23 296 850 47 12 41 12 0 1 23300 9480 62276 109696 0 0 0 323 306 362 32 12 56 20 0 1 23300 9468 62276 109696 0 0 0 29 312 1017 48 13 39 14 0 1 23300 9508 62276 109664 0 0 4 321 272 343 36 12 52 31 0 1 23300 9508 62276 109664 0 0 0 35 310 369 37 12 52 17 0 1 23300 9508 62276 109664 0 0 0 248 318 732 44 12 44 19 0 1 23300 9508 62276 109664 0 0 0 34 260 320 33 12 55 27 0 1 23300 9504 62276 109664 0 0 0 249 253 338 35 11 54 22 0 1 23300 9500 62276 109664 0 0 0 35 240 456 37 12 51 39 0 3 23300 9488 62276 109664 0 0 0 2 194 3626 84 16 0 57 0 4 23300 9408 62276 109664 0 0 0 0 198 3372 85 15 0 40 0 2 23300 9404 62272 109668 0 0 0 0 184 3946 81 19 0 23 0 1 23300 9336 62272 109564 0 0 0 36 259 2773 85 15 0 23 0 3 23300 9392 62272 109344 0 0 0 6 415 1784 85 15 0 7 0 3 23300 10288 62272 108760 0 0 5 486 291 2001 63 14 23 15 0 3 23300 10260 62268 108764 0 0 0 33 304 604 41 12 48 9 0 0 23300 10272 62268 108764 0 0 0 348 255 418 35 14 51 25 0 1 23300 10268 62264 108768 0 0 0 62 218 1140 40 13 47 and a top capture during the above "0 idle time" 335 processes: 292 sleeping, 43 running, 0 zombie, 0 stopped CPU states: 80.8% user, 19.1% system, 0.0% nice, 0.0% idle Mem: 772896K av, 763488K used, 9408K free, 0K shrd, 62272K buff Swap: 1052248K av, 23300K used, 1028948K free 109668K cached PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND 4262 root 8 -10 1236 1236 832 R < 11.1 0.1 0:54 top 3343 apache 16 0 17616 17M 11608 R 3.3 2.2 0:59 httpd 3238 apache 16 0 17644 17M 11608 R 2.6 2.2 1:08 httpd 3245 apache 16 0 17640 17M 11604 R 2.4 2.2 1:08 httpd 3298 apache 16 0 17636 17M 11608 R 2.3 2.2 1:03 httpd 3535 apache 16 0 16756 16M 11700 R 2.3 2.1 0:46 httpd 3151 apache 15 0 17644 17M 11588 R 1.7 2.2 1:16 httpd 3373 apache 16 0 16772 16M 11704 R 1.7 2.1 0:57 httpd 3174 apache 16 0 17632 17M 11608 R 1.5 2.2 1:12 httpd 3214 mysql 16 0 84056 64M 2652 R 1.4 8.5 0:07 mysqld-max 3479 apache 15 0 17608 17M 11612 R 1.4 2.2 0:51 httpd 3152 mysql 15 0 84056 64M 2652 R 1.0 8.5 0:08 mysqld-max 3215 apache 16 0 17636 17M 11588 R 1.0 2.2 1:08 httpd 3477 mysql 16 0 84056 64M 2652 R 1.0 8.5 0:06 mysqld-max 3531 apache 16 0 17600 17M 11644 R 0.8 2.2 0:47 httpd 3534 mysql 17 0 84056 64M 2652 R 0.8 8.5 0:05 mysqld-max 3347 apache 15 0 17632 17M 11596 R 0.7 2.2 0:59 httpd 3579 mysql 15 0 84056 64M 2652 R 0.5 8.5 0:05 mysqld-max 4257 root 15 0 480 480 404 R 0.3 0.0 0:01 vmstat 3438 mysql 15 0 84056 64M 2652 R 0.1 8.5 0:05 mysqld-max 1 root 15 0 456 416 400 R 0.0 0.0 0:06 init 12 root 0 -20 0 0 0 DW< 0.0 0.0 2:47 kjournald 28478 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:06 mysqld-max 29335 root 15 0 13812 13M 11572 R 0.0 1.7 0:01 httpd 3171 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:08 mysqld-max 3217 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:08 mysqld-max 3242 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:07 mysqld-max 3262 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:07 mysqld-max 3264 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:07 mysqld-max 3289 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:07 mysqld-max 3292 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:07 mysqld-max 3299 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:07 mysqld-max 3367 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:06 mysqld-max 3370 mysql 16 0 84056 64M 2652 R 0.0 8.5 0:07 mysqld-max 3400 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:02 mysqld-max 3412 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:06 mysqld-max 3414 mysql 16 0 84056 64M 2652 R 0.0 8.5 0:06 mysqld-max 3435 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:06 mysqld-max 3439 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:06 mysqld-max 3457 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:06 mysqld-max 3503 apache 15 0 16760 16M 11728 R 0.0 2.1 0:50 httpd 3530 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:05 mysqld-max 3577 mysql 17 0 84056 64M 2652 R 0.0 8.5 0:05 mysqld-max 3940 root 15 0 2012 1956 1532 R 0.0 0.2 0:00 sshd -- Jay "yohimbe" Thorne yohimbe@userfriendly.org Mgr Sys & Tech, Userfriendly.org _______________________________________________ Ext3-users@redhat.com https://listman.redhat.com/mailman/listinfo/ext3-users