Wendy Cheng wrote:
Riaan van Niekerk wrote:
My question to you, or anyone who is familiar with NFS on GFS or GFS
in general: which of the following are still valid issues for the
current (6.1u4) version of GFS? If all or most of them still apply, I
can use this as motivation for my customer to strongly consider going
off NFS on GFS. Removing the NFS from our GFS cluster has been on the
cards for quite a while, but has not gained momentum due to lack of
information on the performance gains of such a move (very difficult to
gauge) or the architectural problems/limitations of NFS on GFS (for
which the following extract is spot-on).
These have been worked on and some of them do have test patches ready to
address the issues. However, the changes are non-trivial and may involve
base kernel modifications for which we need upstream (community Linux
kernel) acceptance. The efforts take time since we would like to do it
conservatively to preserve GFS1/2 stability. Unless the posted problems
have urgent needs (let us know), the current NFS-GFS development focus
is on failover (Red Hat bugzilla 132823).
Is performance the primary concern you have now?
-- Wendy
Yes, mostly. We have a couple of open service requests for stability.
They are very intermittent and not reproducible (and nothing in
bugzilla seems to match):
a) Load average on the nodes steadily climbs until it reaches the
nfsd count, upon which all I/O hangs. (We reboot nodes one by one, and as
soon as the one with a stuck lock is bounced, I/O returns to all nodes.)
b) Kernel oopses with "Assertion failed" on line 428 / 357 of dlm/lock.c
while there is no load on the system. This happens 3 days in a row,
over a weekend, and then for weeks the error does not occur again.
Getting the info that support requires (sysrq t, lockdump, etc. on all
nodes, crashdump on the failing node) is pretty difficult. We are not
married to NFS on GFS, even though it is a cost-effective interim step
until we can get all our mail servers (14 in all) SAN-attached.
Can I read into "have been worked on" and "some do have test patches"
that these 4 issues still persist? I need the ammunition to motivate the
move away from NFS on GFS. This architecture document gives it to me if
these issues are still valid.
tnx
Riaan
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster