Re: Ext3/ext4 in a clustered environment


 



Nicolas Ross wrote:

Don't get me wrong: there are millions of files, but no more than a few hundred per directory. They are spread out, split on the database ID two characters at a time, so a file named 1234567.jpg would end up in a directory like 12/34/5/, or something similar.

OK, the way you wrote it looked like flat directory spacing.
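
For reference, a minimal sketch of that kind of split (purely illustrative - I'm guessing at the exact scheme, taking the first five digits of the ID and cutting them two characters at a time):

    import os

    def shard_path(name, prefix_len=5, width=2):
        # Take the leading digits of the filename and split them into
        # fixed-width directory levels, e.g. 1234567.jpg -> 12/34/5
        stem = name.split('.')[0][:prefix_len]
        parts = [stem[i:i + width] for i in range(0, len(stem), width)]
        return os.path.join(*parts)

    print(shard_path("1234567.jpg"))   # -> 12/34/5

With only a few hundred files per leaf directory, that keeps you well under the first GFS knee point mentioned below.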

We see appreciable knee points in GFS directory performance at 512, 4096 and 16384 files per directory, with progressively worse deterioration between each pair of knees. (It's a 2^n-type problem.)

Yes, it is GFS-specific; our backup server is on ext3 and an rsync completes in a couple of hours without eating CPU at all (only memory), and without bringing the server to its knees.

Have you tuned dentry/inode hashes? Have you got enough memory?
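
If not, it may be worth experimenting with the hash table sizes and the cache reclaim behaviour. The values below are only illustrative, not recommendations for your workload:

    # kernel boot parameters - pre-size the dentry/inode hash tables
    dhash_entries=2097152 ihash_entries=2097152

    # /etc/sysctl.conf - hold on to dentry/inode caches longer than the default
    vm.vfs_cache_pressure = 50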

Bear in mind that rsync has to (at least) stat() every single file it looks at, which causes multicast locking traffic between the nodes if the FS is mounted on multiple machines - even mounted on a single node, it's slow.

If you can remount the FS with localflock then you'll see performance akin to your ext3 results, but even on a single-node mount, with appropriate network/memory tuning you can at least double the rsync speed over a vanilla configuration when there are a few million files involved.
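
Something along these lines (the exact option spelling varies between GFS versions, so check the mount man page for your release, and only do this when you are certain no other node has the FS mounted):

    mount -o remount,localflocks /path/to/gfs/volume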

We've experienced numerous cases where the filesystem hangs after a service migration due to a node (or service) failover. These hangs all seem to be related to quota or NFS issues, so this may not be an issue in your environment.

While we do not use NFS on top of the 3 most important directories, it will be used on some of those volumes...

NFS (v2, v3) is old, crufty, not cluster/multitask aware (*), doesn't play nicely with anything else accessing the disk, and seems to be the root cause of most of our stability problems.

I can't speak to pNFS (NFSv4) stability, as that requires bind mounts, which aren't supported in a failover environment - it seems to work on individual nodes, but I've never managed to get it working properly on a cluster.

(*) BEWARE if you have multiple services with NFS exports in them: the exportfs commands can race against each other and scribble over the export list in an unpredictable manner. We fixed this with flock locking in nfsclient.sh, but Red Hat haven't rolled the fix into their distribution yet.
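
For anyone hitting the same race, the workaround is roughly this shape (the lock file name and export arguments here are placeholders, not the actual patch):

    # in the resource agent, serialise concurrent exportfs invocations
    flock /var/lock/nfsclient.exportfs -c "exportfs -u 192.168.1.0/24:/export/vol1"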

=> would be failed and need to be manually restarted. What would be the
=> consequence if the filesystem happens to be mounted on 2 nodes ?

Most likely, filesystem corruption.

Other responses led me to believe that if I let the cluster manage the filesystem, and never mount it myself, it's much less likely to happen.

Correct... but human factors being what they are, combined with other possibilities (such as a failure to unmount, etc.), mean that on any important FS the chance is significantly higher than zero for my liking.




--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

