[snip]
Another strange note is that kswapd would be using 99 percent CPU during
these NFS storms; not sure why, since I wasn't swapping. All of this was
happening while I still had plenty of disk I/O headroom on the storage box.
What kernel version? Is it stock or a distro kernel? How much memory do you
have in the NFS server?
This is a vanilla 2.4.22 with an updated ips.c (ServeRAID driver). The NFS server has 2.5GB RAM; the distro is RH9. Here is a quick glimpse of what the box looked like in this borked state (the system is hyperthreading):
133 processes: 131 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states: 0.0% user 49.1% system 0.0% nice 0.0% iowait 50.3% idle
CPU1 states: 0.1% user 73.4% system 0.0% nice 0.0% iowait 25.4% idle
CPU2 states: 0.2% user 69.2% system 0.0% nice 0.0% iowait 30.0% idle
CPU3 states: 0.0% user 42.3% system 0.0% nice 0.0% iowait 57.1% idle
CPU4 states: 0.0% user 61.4% system 0.0% nice 0.0% iowait 38.0% idle
CPU5 states: 0.0% user 46.0% system 0.0% nice 0.0% iowait 53.4% idle
CPU6 states: 0.3% user 50.4% system 0.0% nice 0.0% iowait 48.2% idle
CPU7 states: 0.0% user 57.2% system 0.0% nice 0.0% iowait 42.2% idle
Mem: 2587108k av, 2582708k used, 4400k free, 0k shrd, 668348k buff
675236k active, 1689620k inactive
Swap: 10490360k av, 5876k used, 10484484k free 1676088k cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
   11 root      14   0     0    0     0 RW   99.9  0.0  6583m   3 kswapd
12159 root      11   0   644  644   400 D    58.5  0.0   0:02   6 find
30480 root       9   0     0    0     0 DW   15.8  0.0   8:21   1 nfsd
30493 root       9   0     0    0     0 DW   14.6  0.0   9:10   1 nfsd
30473 root       9   0     0    0     0 DW   13.6  0.0   8:48   4 nfsd
30476 root       9   0     0    0     0 DW   12.6  0.0   8:43   6 nfsd
30490 root       9   0     0    0     0 DW   12.0  0.0   8:46   4 nfsd
30492 root       9   0     0    0     0 DW   11.6  0.0   8:40   7 nfsd
30479 root       9   0     0    0     0 SW   11.4  0.0   8:31   3 nfsd
 9332 root       9   0     0    0     0 SW   11.2  0.0  2112m   1 kjournald
30491 root       9   0     0    0     0 DW   11.2  0.0   8:46   4 nfsd
30494 root       9   0     0    0     0 SW   11.0  0.0   8:24   7 nfsd
30503 root       9   0     0    0     0 SW   11.0  0.0   8:50   3 nfsd
30500 root       9   0     0    0     0 SW   10.2  0.0   8:19   4 nfsd
30485 root       9   0     0    0     0 SW    9.4  0.0   9:22   5 nfsd
30484 root       9   0     0    0     0 DW    9.0  0.0   9:04   0 nfsd
30482 root       9   0     0    0     0 DW    8.8  0.0   8:46   6 nfsd
30487 root       9   0     0    0     0 SW    8.8  0.0   8:56   4 nfsd
30488 root       9   0     0    0     0 DW    7.7  0.0   9:00   5 nfsd
30504 root       9   0     0    0     0 SW    7.1  0.0   9:01   4 nfsd
30489 root       9   0     0    0     0 DW    6.7  0.0   8:47   0 nfsd
30502 root       9   0     0    0     0 DW    6.5  0.0   8:56   0 nfsd
30474 root       9   0     0    0     0 DW    5.9  0.0   8:37   2 nfsd
30499 root       9   0     0    0     0 DW    5.3  0.0   8:47   7 nfsd
30472 root       9   0     0    0     0 DW    5.1  0.0   9:05   1 nfsd
The kswapd is hammered for some reason.
No IMAP, just POP, SMTP, and web if you count sqwebmail.
Eventually we just started to nuke old mail out of the larger dirs to get
them down to a sane size and things have cleared up.
Sounds like all of the NFS clients waiting for directory processing to respond were helping to cause a VM imbalance (which is why I asked about your kernel version).
Now, from what it sounds like, htree will actually make things worse in
this type of situation; is this correct? Is there a patch somewhere, or a
filesystem out there, that is good at this stat-and-list type of load? Or
is it just NetApp time? :)
So the POP & IMAP servers are on FreeBSD? Theo, will this preload work with
FreeBSD's libc?
I'd suggest you put the preload on your POP & IMAP servers and use htree on your NFS server.
What is this preload? The shared library thing? Wouldn't htree actually make things worse, since it makes stat and list on directories slower?
Steve
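(The preload itself isn't posted in this thread, but the idea being discussed is an LD_PRELOAD wrapper around opendir()/readdir() that slurps the whole directory and hands entries back sorted by inode number, so a follow-up stat() pass walks the inode table roughly in order instead of in htree hash order. A minimal, hypothetical sketch of that kind of wrapper, assuming Linux/glibc semantics and single-threaded use:)

/* readdir_sort.c -- hypothetical sketch, NOT the actual preload
 * discussed here: serve readdir() entries sorted by inode number.
 *
 * Assumes Linux/glibc, where struct dirent has a fixed-size d_name
 * and can be copied by value; no locking, so single-threaded only.
 *
 * Build: gcc -shared -fPIC -o readdir_sort.so readdir_sort.c -ldl
 * Use:   LD_PRELOAD=./readdir_sort.so ls -l /some/big/maildir
 */
#define _GNU_SOURCE
#include <dirent.h>
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STREAMS 64

struct stream {
    DIR *dir;              /* key: the real DIR* handle */
    struct dirent *ents;   /* all entries, sorted by d_ino */
    size_t n, next;
};

static struct stream streams[MAX_STREAMS];

static DIR *(*real_opendir)(const char *);
static struct dirent *(*real_readdir)(DIR *);
static int (*real_closedir)(DIR *);

__attribute__((constructor)) static void init(void)
{
    real_opendir  = dlsym(RTLD_NEXT, "opendir");
    real_readdir  = dlsym(RTLD_NEXT, "readdir");
    real_closedir = dlsym(RTLD_NEXT, "closedir");
}

static struct stream *find_stream(DIR *dir)
{
    for (int i = 0; i < MAX_STREAMS; i++)
        if (streams[i].dir == dir)
            return &streams[i];
    return NULL;
}

static int by_ino(const void *a, const void *b)
{
    const struct dirent *x = a, *y = b;
    return (x->d_ino > y->d_ino) - (x->d_ino < y->d_ino);
}

DIR *opendir(const char *name)
{
    DIR *dir = real_opendir(name);
    struct stream *s = dir ? find_stream(NULL) : NULL;
    if (!dir || !s)
        return dir;          /* table full: plain readdir order */

    /* Slurp every entry up front, then sort by inode number. */
    size_t cap = 256, n = 0;
    struct dirent *ents = malloc(cap * sizeof *ents), *e;
    while (ents && (e = real_readdir(dir)) != NULL) {
        if (n == cap) {
            struct dirent *t = realloc(ents, (cap *= 2) * sizeof *ents);
            if (!t) { free(ents); ents = NULL; break; }
            ents = t;
        }
        ents[n++] = *e;
    }
    if (!ents) {             /* out of memory: fall back, unsorted */
        rewinddir(dir);
        return dir;
    }
    qsort(ents, n, sizeof *ents, by_ino);
    s->dir = dir; s->ents = ents; s->n = n; s->next = 0;
    return dir;
}

struct dirent *readdir(DIR *dir)
{
    struct stream *s = find_stream(dir);
    if (!s)
        return real_readdir(dir);
    return s->next < s->n ? &s->ents[s->next++] : NULL;
}

int closedir(DIR *dir)
{
    struct stream *s = find_stream(dir);
    if (s) {
        free(s->ents);
        memset(s, 0, sizeof *s);
    }
    return real_closedir(dir);
}

(Sorting by d_ino is the whole trick; everything else is bookkeeping to tie the sorted list to the DIR* handle that libc hands back.)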
xfs, jfs, reiserfs, and ext3 with htree all use indexed directories, but I
don't know the details of how xfs or jfs behave with Maildir.
Reiserfs has the same issues as current htree (without the reordering patch)
IIRC.
Mike
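(For reference, the reordering Mike mentions can also be approximated in an application without any kernel patch: read the whole directory first, sort the entries by inode number, and only then stat them. A small hypothetical example of that pattern, using scandir() with an inode comparator:)

/* List a directory with stat() calls ordered by inode number, so the
 * inode table is read roughly sequentially instead of in hash order.
 */
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

static int by_ino(const struct dirent **a, const struct dirent **b)
{
    return ((*a)->d_ino > (*b)->d_ino) - ((*a)->d_ino < (*b)->d_ino);
}

int main(int argc, char **argv)
{
    const char *dir = argc > 1 ? argv[1] : ".";
    struct dirent **ents;
    int n = scandir(dir, &ents, NULL, by_ino);  /* sorted by d_ino */
    if (n < 0) { perror("scandir"); return 1; }

    char path[4096];
    struct stat st;
    for (int i = 0; i < n; i++) {
        snprintf(path, sizeof path, "%s/%s", dir, ents[i]->d_name);
        if (stat(path, &st) == 0)
            printf("%10lu %10lld %s\n", (unsigned long)st.st_ino,
                   (long long)st.st_size, ents[i]->d_name);
        free(ents[i]);
    }
    free(ents);
    return 0;
}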