Hi,

On Tue, 2009-01-20 at 22:32 -0500, Jeff Sturm wrote:
> > -----Original Message-----
> > From: linux-cluster-bounces@xxxxxxxxxx
> > [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of
> > nick@xxxxxxxxxxxxxxx
> > Sent: Tuesday, January 20, 2009 5:19 AM
> > To: linux-cluster@xxxxxxxxxx
> > Subject: Directories with >100K files
> >
> > We have a GFS filesystem mounted over iSCSI. When doing an
> > 'ls' on directories with several thousand files it takes
> > around 10 minutes to get a response back -
>
> You don't say how many nodes you have, or anything about your
> networking.
>
> Some general pointers:
>
> - A plain "ls" is probably much faster than any variant that fetches
> inode metadata, e.g. "ls -l". The latter performs a stat() on each
> individual file, which in turn triggers locking activity of some sort.
> This is known to be slow on GFS1. (I've heard reports that GFS2 is/will
> be better.)

The latest gfs1 is also much better. It is a tricky thing to do
efficiently, and not doing the stats is a good plan.

> - You want a fast, reliable, low-latency network for your cluster. Intel
> GigE cards and a fast switch are a good bet.
>
> - Unless your application needs access times or quota support, mounting
> with "noquota,noatime" is a good idea. Maybe also "nodiratime".
>
> > Can anyone recommend any GFS tunables to help us out here ?
>
> You could try bumping demote_secs up from its default of 5 minutes.
> That'll cause locks to be held longer, so they may not need to be
> reacquired so often. It won't help with the initial directory listing,
> but should help on subsequent invocations.
>
> In your case, with "ls" taking 8 minutes to run, some locks initially
> acquired during execution of the command have already been demoted by
> the time it completes.

Also, the question to ask is: how many nodes are accessing this
filesystem? If more than one node is accessing the same directory, and at
least one of them does a write (i.e. an inode create/delete) within the
demote_secs interval, then raising demote_secs will not make much
difference, since the locks will be pushed out by the other node's access
anyway.

> > Should we set statfs_fast to 1 ?
>
> Probably good to set this, regardless.
>
> > What about glock_purge ?
>
> Glock_purge helps limit CPU time consumed by gfs_scand when a large
> number of unused glocks are present. See
> http://people.redhat.com/wcheng/Patches/GFS/readme.gfs_glock_trimming.R4
> This may make your system run better, but I'm not sure it's going to
> help with listing your giant directories.

Better to disable this altogether unless there is a very good reason to
use it. It generally has the effect of pushing things out of cache early,
so it is best avoided.

> > Here is the fstab entry for the GFS filesystem:
> >
> > /dev/vggfs/lvol00 /apps gfs _netdev 1 2
>
> Try "noatime,noquota" here.
>
> Jeff

Steve.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
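
For anyone following along, here is a minimal sketch of how the GFS1
tunables discussed in the thread are typically adjusted at runtime,
assuming the /apps mount point from the fstab entry above. The 600-second
demote_secs figure is only an illustrative value, not one suggested by
either poster:

    # Show the current tunable values for the GFS1 filesystem mounted at /apps
    gfs_tool gettune /apps

    # Hold glocks longer than the 300-second (5-minute) default;
    # 600 is an arbitrary example value
    gfs_tool settune /apps demote_secs 600

    # Enable the faster statfs implementation mentioned above
    gfs_tool settune /apps statfs_fast 1

    # glock_purge defaults to 0 (disabled); per Steve's advice, leave it
    # there unless gfs_scand CPU time becomes a problem

Note that settune values do not survive a remount, so they are usually
reapplied from an init script or similar after each mount.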
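
And with Jeff's suggested mount options applied, the fstab entry would
look something like this (the rest of the line is unchanged from the
original post; "nodiratime" could be appended as well if directory access
times are not needed):

    /dev/vggfs/lvol00   /apps   gfs   noatime,noquota,_netdev   1 2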