Hi
We are running a 20-node cluster on Scientific Linux 5.3, with a GFS
shared filesystem hosted on our SAN. The cluster nodes are dual-core
machines with 4 GB of RAM and a standard QLogic FC HBA.
Most of the 20 nodes form a batch-processing cluster, and our users are
happy enough with the performance they get, but some nodes are used
interactively. When the filesystem is under stress from large batch jobs
running on other nodes, interactive use becomes very slow and painful.
Is there any tuning I (the sysadmin) can do that might help in this
situation? Would a migration to gfs2 make a difference? Are all nodes
treated identically, or can hosts mounting the filesystem have any kind of
priority/QoS? Which tools could I use to track down any bottlenecks?
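For context, this is roughly the sort of thing I had in mind, though I'm
guessing at the knobs; the mount point /mnt/gfs is just a placeholder and I
haven't confirmed that all of these tunables exist in our GFS version:

  # Trim cached glocks more aggressively (values are guesses)
  gfs_tool settune /mnt/gfs glock_purge 50   # purge ~50% of unused glocks per scan
  gfs_tool settune /mnt/gfs demote_secs 200  # demote unused locks sooner (default is 300, I think)
  gfs_tool settune /mnt/gfs statfs_fast 1    # faster df/statfs at the cost of accuracy

  # Cut lock traffic from atime updates
  mount -o remount,noatime,nodiratime /mnt/gfs

  # Where I'd start looking for bottlenecks
  gfs_tool counters /mnt/gfs   # glock/lock counters for the mount
  gfs_tool lockdump /mnt/gfs   # per-glock detail (very large output)
  iostat -x 5                  # per-device latency/queueing on the FC path

If any of that is the wrong direction, pointers to the right tunables or
tools would be much appreciated.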
In theory we could update the kernel and GFS packages to a later release
(though we saw the same issues on this cluster with an SL4.x stack), but for
now we're running:
kernel-2.6.18-128.1.1.el5.i686
kmod-gfs-0.1.31-3.el5.i686
gfs-utils-0.1.20-7.el5.i386
gfs2-utils-0.1.53-1.el5_3.1.i386
Thanks for any help/suggestions,
Kevin