1) What is the ratio of file reads to writes/creates for your Java application? If this is very high (say 100:1 or more), GFS may work just fine. In our experience we have the most trouble with write contention, especially on shared directories.

2) How much time elapses (statistically speaking) between consecutive reads of the same file on the same node? If this is low enough, you may be able to tune demote_secs such that glocks can be reused for file accesses. If you have too many files to cache the inodes or glocks in memory, you may be better off tuning demote_secs and glock_purge to keep the numbers small, and accepting the overhead of each file access having to obtain a lock.

3) What does your directory layout look like? How many files are you placing in the same directory? You'll probably want to avoid very large directories. If, for example, all files are kept in a single directory, you'll get write contention that would effectively limit file creates to a single node at a time. For directories with a high percentage of file creates, we've had better luck establishing one directory per node, such that each node can read files created by others but only writes to its own directory. (And session affinity, to reduce the frequency of cross-node reads.)

Good luck. The above advice is based on empirical evidence from our own performance testing and other net wisdom, and on the positive results we obtained from strategies we employed both within our application and via GFS tuning. (The experts can tell you if I got any of this right or wrong, since I lack an in-depth understanding of GFS/DLM internals. GFS2 may behave very differently; we haven't had a chance to try it yet.)
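As an illustration of the one-directory-per-node layout from point 3, here is a minimal Java sketch. It is only an example under stated assumptions: the class name PerNodeStore, its methods, and the layout root/<nodeName>/<file> are all hypothetical, and nothing here is GFS-specific. It relies only on the shared mount behaving like an ordinary file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

/**
 * Hypothetical helper: a node creates files only under root/<nodeName>,
 * but may read files from any node's directory under the same root.
 */
public class PerNodeStore {
    private final Path root;   // shared mount point (assumed layout)
    private final Path ownDir; // the only directory this node writes to

    public PerNodeStore(Path root, String nodeName) throws IOException {
        this.root = root;
        this.ownDir = root.resolve(nodeName);
        Files.createDirectories(ownDir);
    }

    /** Creates (or overwrites) a file in this node's own directory. */
    public Path write(String fileName, byte[] data) throws IOException {
        return Files.write(ownDir.resolve(fileName), data);
    }

    /** Looks in this node's directory first, then in the other nodes' directories. */
    public Optional<Path> find(String fileName) throws IOException {
        Path own = ownDir.resolve(fileName);
        if (Files.isRegularFile(own)) {
            return Optional.of(own);
        }
        try (Stream<Path> nodeDirs = Files.list(root)) {
            return nodeDirs
                    .map(dir -> dir.resolve(fileName))
                    .filter(Files::isRegularFile)
                    .findFirst();
        }
    }
}
```

Each node would construct this with its own name, for instance the value of InetAddress.getLocalHost().getHostName(), so that file creates from different nodes never land in the same directory. The fallback search in find() is the cross-node read that session affinity is meant to keep rare.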
Jeff

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Geoff Galitz
Sent: Monday, December 01, 2008 4:30 AM
To: linux-cluster@xxxxxxxxxx
Subject: GFS on Centos

We are investigating deploying GFS across a small pool of servers:

Centos 5.1 x86_64
GigE Networking

The data will consist of approximately 400GB of small JPG files accessed by an inhouse java app. The entire cluster is 50 machines but only 7 will require access to this data repository.

GFS2 is not ready, yet... but my main question is, is it worth it to wait for GFS2? We are also looking at glusterfs.

Our goal is:
- low administrative (sysadm) overhead
- good performance when accessing lots of small files (<100Mb)

Geoff Galitz
Blankenheim NRW, Deutschland
http://www.galitz.org

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster