Fajar A. Nugraha wrote:
Hi Wendy,
Please help me go through this summary from the bugzilla
Before we complete the work, for NFS v2/V3, RHEL 4.4 has the following
restrictions:
==> Is this still valid for RHEL 4.5 and RHEL5?
NFS failover will most likely work, except in the documented corner cases.
Our customers normally find the restrictions workable. One thing that has
to be made clear is that these are all inherent Linux kernel issues. RHCS
has been doing a good job of working around a large portion of them.
Occasionally you'll see ESTALE or EPERM though. The fixes didn't make it
into RHEL 4.5 or RHEL 5.
B-1: Unless NFS client applications can tolerate ESTALE and/or EPERM errors,
IO activities on the failover ip interface must be temporarily quiesced
until active-active failover transition completes. This is to avoid
non-idempotent NFS operation failures on the new server. (Check out
"Why NFS Sucks" by Olaf Kirch, available as "kirch-reprint.pdf" in the
2006 OLS proceedings.)
==> What does this mean, exactly? For example, does this mean that I
should not use RHCS-nfs-mounted storage for
busy-accessed-all-the-time-web-servers because I'd likely get
ESTALE/EPERM during failover?
NFS V2/V3 failover has been a difficult subject regardless of which
platform you're on. Assuming a flawless failover is naive.
NFS V4 (where the NFS client is required to play a helping role) was
developed to remedy these issues.
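
For what "tolerate ESTALE and/or EPERM" in B-1 might look like on the
client side, here is a minimal Python sketch. It is only an illustration,
not anything RHCS ships: the function name, retry count and delay are
made up, and it covers only a whole-file read, which is safe to repeat.
Non-idempotent operations (rename, remove, exclusive create, ...) cannot
be blindly retried like this.

```python
import errno
import time


def read_with_estale_retry(path, opener=open, retries=3, delay=1.0,
                           sleep=time.sleep):
    """Read a whole file, reopening it if the NFS handle went stale.

    Hypothetical helper: names and defaults are assumptions for
    illustration only.
    """
    for attempt in range(retries + 1):
        try:
            # Reopening by path gets a fresh file handle from the server.
            with opener(path, "rb") as f:
                return f.read()
        except OSError as exc:
            # Retry only the two errors named in B-1; re-raise anything
            # else, and give up once the retry budget is used.
            if exc.errno not in (errno.ESTALE, errno.EPERM):
                raise
            if attempt == retries:
                raise
            sleep(delay)
```

The opener and sleep parameters exist only so the retry path can be
exercised without a real NFS mount.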
B-2: With various possible base kernel bugs outside RHCS' control, there
are possibilities that local filesystem (such as ext3) umount could
fail. To ensure data integrity, RHCS will abort the failover. Admin
could specify the self-fence (reboot taken-over server) option
to force failover (via cluster.conf file).
==> In short, it'd be better using GFS, right?
GFS certainly works better in this arena.
B-3: If nfs client invokes NLM locking call, the subject nfs servers (both
taken-over and take-over) will enter a global 90-second (tunable)
locking grace period for every nfs service on the servers.
==> What does "locking grace" mean? Does it mean read-write access is
allowed but no locks, or no access at all?
If it is a new lock request, the lock call will hang until the grace
period is over. This is to allow existing lock holders to reclaim their
locks, and it has always been part of the NFS-NLM protocol. Reads and
writes can keep going without restrictions.
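
To see the effect from an application, one could time a blocking lock
request. This is a rough Python sketch; the function name is made up and
it assumes a Linux client where fcntl locks on an NFS v2/v3 mount are
forwarded through NLM (i.e. not mounted with "nolock"). On a local file
it returns almost immediately; on an NFS mount during the grace period a
new request would be expected to wait as described above.

```python
import fcntl
import time


def timed_exclusive_lock(path):
    """Take and release an exclusive lock; return seconds spent waiting.

    Hypothetical helper for illustration only.
    """
    with open(path, "a+b") as f:
        start = time.monotonic()
        # Blocking request: returns only once the lock is granted.
        fcntl.lockf(f, fcntl.LOCK_EX)
        waited = time.monotonic() - start
        fcntl.lockf(f, fcntl.LOCK_UN)
    return waited
```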
B-4: If NFS-TCP is involved, failover should not be issued on the same pair
of machines multiple times within 30-minute period; for example,
failing over from node A to B, then immediately failing from B back to
A would hang the connection. This is to avoid TCP TIME_WAIT issue.
==> So what does this mean currently in TCP vs UDP world? Does it mean
nfs v3 UDP is the preferred method?
No. TCP is definitely a better protocol. Read the sentence carefully -
"failing over from node A to B, then immediately failing from B back to
A again will hang the connection".
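
If failovers are scripted, the B-4 restriction could be checked before
issuing one. A minimal Python sketch follows; the helper and its history
format are hypothetical and not part of RHCS. Only the 30-minute window
and the "same pair of machines" rule come from the text above.

```python
FAILOVER_WINDOW_SECONDS = 30 * 60  # 30-minute period from B-4


def failover_allowed(history, src, dst, now,
                     window=FAILOVER_WINDOW_SECONDS):
    """Return False if src/dst already failed over within the window.

    history is a list of (timestamp, from_node, to_node) tuples; this
    format is an assumption made for the example.
    """
    pair = frozenset((src, dst))
    for when, a, b in history:
        # Direction does not matter: A->B followed by B->A is exactly
        # the case B-4 warns about.
        if frozenset((a, b)) == pair and now - when < window:
            return False
    return True
```

For example, with history = [(1000, "A", "B")], a request to fail from B
back to A at time 1060 is refused, while the same request at time 2800
(1800 seconds later) is allowed.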
-- Wendy
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster