On Fri, 2003-12-19 at 01:04, Theodore Ts'o wrote:
> On Thu, Dec 18, 2003 at 01:36:25PM +1100, Adam Cassar wrote:
> > What's your take on the nfs client load issues? It does run for 4-5
> > hours albeit at higher load (how explained by your post) however it does
> > eventually die with the load going stupid (180 odd). It seems that the
> > patch still has some nfs interoperability problems.
>
> Was this on the nfs *client* or the nfs *server*?

This was on the client, which is an MP box. The server was still
responsive (with a higher load, however) and I could still cat files on
the filesystem.

There is also the separate issue of the filesystem corruption, with
e2fsck reporting that everything was OK (e2fsprogs v1.34).

> I'd really, really like to see a ps listing on the machine involved;
> the output of "ps alxww" and "ps auxww" would be useful. The question
> is what processes are hung in wait, and what they're waiting on....

It's going to be a little difficult for me to do that, as I am going
away on vacation for a month starting today. I really should have tried
to gather more state information before posting, sorry. When I get back
I can play with it; it shouldn't be too difficult to duplicate in a test
setup.

> It would also be interesting to see if the LD_PRELOAD hack which I
> sent you helped alleviate the load on the server? With the LD_PRELOAD
> hack, the access pattern on stat's and open's should be restored to
> the original workload, so if that makes the problem go away, then
> the problem was merely that NFS doesn't degrade gracefully under load.

I am happy to give that a go as well, but it is going to have to wait,
unfortunately.

> I believe, although I am not sure, that there are some NFS
> improvements that went into 2.6 that didn't get back-ported to 2.4.
> So it might be that running 2.6.0 on the clients and/or servers might
> actually help. That would be a pretty daring move, though....
So was trying the htree patch :)

> Finally, can you give me a little bit more detail of exactly what is
> running on the clients and server, and the rationale of why you are
> trying to apparently run incoming mail processes over NFS? (Is that
> what you're doing? If so, it sounds rather scary...)

The clients generally run exim and courier-imap/courier-pop. Mail is
delivered and read over NFS via the maildir format, which is NFS safe.
As maildir stores each email in a single file, I imagined htree would
improve performance on the servers.

What appears to be the issue is that performance degrades over time to
an unusable state. I only really noticed the issue on the POP server, as
the load on the machine was ridiculous and mail was not retrievable.

_______________________________________________
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users