I've seen a significant performance drop with ext3 (and other) filesystems
with tens to hundreds of thousands of files per directory. Make sure that the
"directory hash" option is enabled for ext3. With ~1M files per directory, I'd
do some performance tests comparing rsync under ext3, ext4, and GFS before
changing filesystems... while ext3/4 do perform better than GFS, the directory
size may be such an overwhelming factor that the filesystem choice is
irrelevant.
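The "directory hash" feature is the dir_index filesystem flag; a minimal sketch of
checking and enabling it (using the device name from the config further down purely
as a placeholder, and e2fsck -D should only be run on an unmounted filesystem):

  # check whether dir_index is already in the feature list
  tune2fs -l /dev/VGx/documentsA | grep -i features
  # enable it, then rebuild/optimize the existing directories
  tune2fs -O dir_index /dev/VGx/documentsA
  e2fsck -fD /dev/VGx/documentsA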
Don't get me wrong, there are millions of files, but no more than a few hundred
per directory. They are spread out, split on the database id, 2 characters
at a time. So a file named 1234567.jpg would end up in a directory 12/34/5/,
or something similar.
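(For illustration only, a rough shell sketch of that kind of split; the exact
directory depth and widths are an assumption on my part:)

  id=1234567
  dir="${id:0:2}/${id:2:2}/${id:4:1}"    # 1234567 -> 12/34/5
  mkdir -p "$dir" && cp "${id}.jpg" "$dir/"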
Is this strictly a GFS issue, or an issue with rsync? Have you set up a
similar environment under ext3/4 to test just the rsync part? Rsync is
known for being a memory & resource hog, particularly at the initial
stage of building the file list.
I would strongly recommend benchmarking rsync on ext3/4 before making the
switch.
One option would be to do several 'rsync' operations (serially, not in
parallel!), each operating on a subset of the filesystem, while continuing
to use gfs.
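Something along these lines (just a sketch; the backup host and target path are
placeholders, only the documentsA mountpoint comes from the config quoted below):

  # one rsync per top-level hash directory, run one after the other
  for d in /GFSVolume1/Service1/documentsA/*/ ; do
      rsync -a --delete "$d" backuphost:/backup/documentsA/"$(basename "$d")"/
  done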
Yes, it is GFS-specific; our backup server is on ext3 and the rsync can be
done in a couple of hours, without eating CPU at all (only memory), and
without bringing the server to its knees.
Splitting on subdirectories might be an option, but that would be more like a
band-aid... I'll try to avoid that.
=> <fs device="/dev/VGx/documentsA" force_unmount="1" fstype="ext4"
=> mountpoint="/GFSVolume1/Service1/documentsA" name="documentsA"
=> options="noatime,quota=off"/>
=>
=> So, first, is this doable?
Yes.
We have been doing something very similar for the past ~2 years, except
not mounting the ext3/4 partition under a GFS mountpoint.
I will be doing some experiments with that...
=>
=> Second, is this risky? In the sense that with force_unmount true, I assume
=> that no other node would mount that filesystem before it is unmounted on the
=> stopping service. I know that for some reason umount could hang, but it's
=> not likely since this data is mostly read-only. In that case the service
We've experienced numerous cases where the filesystem hangs after a
service migration due to a node (or service) failover. These hangs all
seem to be related to quota or NFS issues, so this may not be an issue
in your environment.
While we do not use NFS on top of the 3 most important directories, it will
be used on some of those volumes...
=> would be failed and need to be manually restarted. What would be the
=> consequence if the filesystem happens to be mounted on 2 nodes?
Most likely, filesystem corruption.
Other responses led me to believe that if I let the cluster manage the
filesystem, and never mount it myself, it's much less likely to happen.
Thanks