Re: htree stability and performance issues

Mike Fedyk <mfedyk@xxxxxxxxxxxxx> · Fri, 19 Dec 2003 16:38:32 -0800

On Fri, Dec 19, 2003 at 01:43:11PM -0700, kwijibo@xxxxxxxxxx wrote:
> Mike Fedyk wrote:
> 
> [snip]
> 
> >>Another strange note is that kswapd would be using 99 percent CPU during
> >>these NFS storms, not sure why, since I wasn't swapping.  All of this was
> >>happening while I had plenty of disk IO left on the storage box. 
> >>   
> >>
> >
> >What kernel version?  Is it stock or distro?  How much memory do you have in
> >the nfs server?
> > 
> >
> This is a vanilla 2.4.22 with an updated ips.c (ServeRAID driver).
> NFS server has 2.5GB RAM.  Distro is RH9.  Here is a quick glimpse
> of what the box looked like in this borked state: (System is hyperthreading)

You have a lot of processors, and a relatively large amount of memory, so
I'd first suggest you try 2.4.23, as it has some of the -aa VM patches
merged in that release.

If that doesn't fix the problems you're seeing with respect to kswapd, then
try -aa.  It is meant for larger boxes like yours.  The -aa tree also has
several nfs updates, and is in maintenance mode, much like the vanilla 2.4
kernel, but geared towards a different target (higher end systems).

> >I'd suggest you put the preload on your pop & imap servers and use htree on
> >your nfs server.
> >
> What is this preload?  Like the shared library thing?  Wouldn't
> htree actually make things worse, since it makes dirs stat and list
> slower?

Let me be more specific about what is happening with htree.  The
individual stat calls will be faster with htree as they pass through the
kernel, since a large directory is indexed and the kernel no longer has to
search through, on average, half of the list to find the entry you want to
stat.
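
To make the lookup cost concrete, here is a toy Python model (not the ext3
code).  A plain list stands in for the linear directory and a dict stands in
for the index.  The function names are made up for illustration, and a Python
dict is a flat hash table, not the hashed tree that htree uses.

```python
def linear_lookup(entries, name):
    """Scan the list front to back; return how many entries were compared."""
    for compared, entry in enumerate(entries, start=1):
        if entry == name:
            return compared
    return len(entries)  # not found: every entry was compared


def build_index(entries):
    """Map each name to its position once, so later lookups avoid the scan."""
    return {name: pos for pos, name in enumerate(entries)}


def average_linear_cost(entries):
    """Average comparisons when every name in the directory is looked up once."""
    return sum(linear_lookup(entries, n) for n in entries) / len(entries)
```

For a directory of n entries the average comes out at (n + 1) / 2
comparisons per lookup, which is the "half of the list" figure above.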

The slowdown occurs when the stat calls hit the disk.  In the non-htree
case, when a directory entry is added it is added at the tail end of the
list.  Typically, that correlates with the order of the file layout on the
disk.  So when you read the directory, it can read the files in disk order,
and is relatively fast (once you have gone through the overhead of the linear
search through the directory, which is cpu bound).

With htree, when you read a directory, it returns the list in index order,
which has nothing to do with the order of the files on disk.  That causes
more seeking and slows the overall process (I've heard; I haven't done any
testing myself).  Even though the overhead of the linear search is gone,
things slow down once the stat calls hit the disks.
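
A rough way to picture the difference, as a toy Python model rather than a
measurement.  It assumes inodes are handed out sequentially as files are
created, uses MD5 purely as a stand-in for the htree hash, and treats the
jump between consecutive inode numbers as a proxy for seeking.  All names
here are hypothetical.

```python
import hashlib


def creation_order(names):
    """Non-htree model: entries are listed in the order they were added."""
    return list(names)


def hash_order(names):
    """htree model: entries are listed by a hash of the name (MD5 stand-in)."""
    return sorted(names, key=lambda n: hashlib.md5(n.encode()).digest())


def seek_distance(listing, inode_of):
    """Sum of inode-number jumps between consecutive stat calls."""
    inodes = [inode_of[n] for n in listing]
    return sum(abs(b - a) for a, b in zip(inodes, inodes[1:]))


names = [f"msg{i:05d}" for i in range(1000)]
# Assumption: inodes are allocated sequentially in creation order.
inode_of = {name: 1000 + i for i, name in enumerate(names)}
```

Comparing seek_distance(creation_order(names), inode_of) with
seek_distance(hash_order(names), inode_of) shows the hashed listing jumping
around far more, which is the effect described above.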

That is where the shared library comes in (it is loaded with the LD_PRELOAD
mechanism): it sorts the directory in userspace before any stat calls can be
made, where it is an order of magnitude simpler to do the sorting than in
kernel space.  (Even so, work is in progress on a patch for an acceptable
way to do some type of sorting in the kernel.)
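
The same idea can be sketched in a few lines of Python.  This is not the
preload library itself, only an illustration of sorting before stat'ing.  It
assumes the sort key is the inode number, which I haven't stated above, and
the function name is made up.

```python
import os


def stat_in_inode_order(path):
    """Read the whole directory, sort by inode number, then lstat each entry."""
    with os.scandir(path) as it:
        # On Linux, DirEntry.inode() comes from the directory read itself,
        # so no stat call has been issued yet at this point.
        entries = [(entry.inode(), entry.name) for entry in it]
    entries.sort()  # inode order, assumed to track on-disk order
    return [(name, os.lstat(os.path.join(path, name))) for _ino, name in entries]
```

A preload library would do the equivalent underneath the application's
directory reads, so unmodified pop and imap servers would get the sorted
order without any code changes.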

(PS, if I've made any errors, please let me know.  I'm basically restating
what I understand from reading the lists.)

Mike

_______________________________________________

Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users