On 05/22/2012 01:44 AM, Raghavendra Gowdappa wrote:
----- Original Message -----
From: "Anand Avati"<aavati@xxxxxxxxxx>
To: "Raghavendra Gowdappa"<rgowdapp@xxxxxxxxxx>
Cc: "Pranith Kumar Karampuri"<pkarampu@xxxxxxxxxx>, "Vijay Bellur"<vbellur@xxxxxxxxxx>, "Amar Tumballi"
<atumball@xxxxxxxxxx>, "Krishnan Parthasarathi"<kparthas@xxxxxxxxxx>, gluster-devel@xxxxxxxxxx
Sent: Tuesday, May 22, 2012 12:41:36 PM
Subject: Re: RFC on fix to bug #802414
<in continuation from our chat>
The PARENT_DOWN_HANDLED approach will take us backwards from the current
state, where we are resilient to frame losses and other classes of bugs
(i.e., if a frame loss happens on either server or client, it only
prevents graph cleanup, but the graph switch still happens).
The root "cause" here is that we are giving up on a very important and
fundamental principle: immutability of the fd object. The real solution
here is to never modify fd->inode. Instead we must bring about a more
native fd "migration" than just re-opening an existing fd on the new
graph.
Think of the inode migration analogy. The handle coming from FUSE (the
address of the object) is a "hint". The hint is right if the object at
that address belongs to the latest graph, which is the usual case. If
not, using the GFID we resolve a new inode on the latest graph and use
it.
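To make the "handle is only a hint" idea concrete, here is a minimal
sketch in C. It is not the actual fuse-bridge code: the toy_* types and
functions are hypothetical stand-ins, the inode table is a plain array,
and a miss simply returns NULL where a real resolver would have to do
more work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real inode_t or graph types. */
typedef struct toy_graph toy_graph_t;

typedef struct toy_inode {
    unsigned char gfid[16];   /* identity that survives graph switches */
    toy_graph_t  *graph;      /* graph this inode object belongs to */
} toy_inode_t;

struct toy_graph {
    toy_inode_t *inodes;      /* toy "inode table": a flat array */
    size_t       count;
};

/* Linear search of one graph's table by GFID. */
toy_inode_t *toy_inode_find(toy_graph_t *graph, const unsigned char gfid[16])
{
    for (size_t i = 0; i < graph->count; i++) {
        if (memcmp(graph->inodes[i].gfid, gfid, 16) == 0)
            return &graph->inodes[i];
    }
    return NULL;
}

/* Treat the incoming handle as a hint: use it directly if it already
 * lives on the latest graph, otherwise resolve by GFID on the latest graph. */
toy_inode_t *toy_resolve_inode(toy_inode_t *hint, toy_graph_t *latest)
{
    if (hint->graph == latest)
        return hint;                          /* hint was right */
    return toy_inode_find(latest, hint->gfid); /* NULL if not known there */
}
```

The caller never rewrites the hinted object; it only decides which
object to use for this request.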
In the case of an fd we can do something similar, except there are no
GFIDs (which should not be a problem). We need to make the handle coming
from FUSE (the address of the fd_t) just a hint. If
fd->inode->table->xl->graph is the latest, then the hint was a HIT. If
the graph is not the latest, we look for a previous migration
attempt+result in the "base" (original) fd's context. If that does not
exist or is not fresh (on the latest graph), then we create a new fd,
open it on the new graph, fd_unref the old cached result in the fd
context of the "base fd" and keep a ref to this new result. All this
must happen from fuse_resolve_fd(). Setting and updating the latest fd
pointer happens under the scope of the base_fd->lock(), which gives it a
very clear and unambiguous scope that was missing with the old scheme.
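The resolution steps described above could look roughly like the
following C sketch. Everything named sketch_* is hypothetical: the
struct is a toy stand-in for fd_t, the `graph` member stands in for
fd->inode->table->xl->graph, the `migrated` member stands in for the
value kept in the base fd's context, and "open on the new graph" is
reduced to an allocation. The refcount is a plain int and is not safe
for concurrent use; this only illustrates the control flow, not a
definitive implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the real GlusterFS types. */
typedef struct sketch_graph { int id; } sketch_graph_t;

typedef struct sketch_fd {
    sketch_graph_t   *graph;     /* never modified after creation */
    int               refcount;  /* toy refcount, not atomic */
    pthread_mutex_t   lock;      /* stands in for the base fd's lock */
    struct sketch_fd *migrated;  /* cached migration result (base fd only) */
} sketch_fd_t;

/* Toy "fd creation + open on graph": just allocates with one ref. */
sketch_fd_t *sketch_fd_open(sketch_graph_t *graph)
{
    sketch_fd_t *fd = calloc(1, sizeof(*fd));
    if (fd == NULL)
        return NULL;
    fd->graph = graph;
    fd->refcount = 1;
    pthread_mutex_init(&fd->lock, NULL);
    return fd;
}

/* Drop one ref; on the last ref also release any cached migration result. */
void sketch_fd_unref(sketch_fd_t *fd)
{
    if (fd == NULL)
        return;
    assert(fd->refcount > 0);
    if (--fd->refcount == 0) {
        sketch_fd_unref(fd->migrated);
        pthread_mutex_destroy(&fd->lock);
        free(fd);
    }
}

/* Returns the fd to use on `latest`, or NULL on failure. The returned
 * pointer is borrowed from the base fd; a real caller would take its own ref. */
sketch_fd_t *sketch_resolve_fd(sketch_fd_t *base, sketch_graph_t *latest)
{
    sketch_fd_t *result = NULL;

    if (base->graph == latest)
        return base;                          /* the hint was a HIT */

    pthread_mutex_lock(&base->lock);
    if (base->migrated == NULL || base->migrated->graph != latest) {
        /* no previous result, or it is stale: migrate afresh */
        sketch_fd_t *fresh = sketch_fd_open(latest);
        if (fresh != NULL) {
            sketch_fd_unref(base->migrated);  /* drop old cached result */
            base->migrated = fresh;           /* base keeps the ref */
        }
    }
    if (base->migrated != NULL && base->migrated->graph == latest)
        result = base->migrated;
    pthread_mutex_unlock(&base->lock);

    return result;
}
```

Note that the base fd's own `graph` is only ever read, never written,
so the check outside the lock is safe in this sketch; only the cached
`migrated` pointer changes, and only under the base fd's lock.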
I remember discussing this solution during the initial design, but I am
not sure why we dropped it. So, can I go ahead with the implementation?
Is this fix required post 3.3?
The solution you are probably referring to was dropped because there we
were talking about chaining FDs to the one on the "next graph" as graphs
keep getting changed. The one described above is different because here
there will be one base fd (the original one on which open() by fuse was
performed), and each new graph results in the creation of an internal
new fd directly referred to by the base fd (naturally unref'ing the
previous "new fd"), thereby keeping things quite trim.
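As a rough illustration of the difference, the C sketch below counts
how many fd objects stay alive after several graph switches under each
scheme. The toy_fd_* names are hypothetical, and the "chained" variant
is only my guess at what chaining to the fd on the next graph would
look like; it is not taken from any earlier patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an fd opened on one particular graph. */
typedef struct toy_fd {
    int            graph_id;
    struct toy_fd *next;  /* chained: fd on the next graph;
                             base-fd scheme: the single latest fd */
} toy_fd_t;

toy_fd_t *toy_fd_new(int graph_id)
{
    toy_fd_t *fd = calloc(1, sizeof(*fd));
    if (fd != NULL)
        fd->graph_id = graph_id;
    return fd;
}

/* Chained scheme: each graph switch hangs a new fd off the chain's tail. */
int toy_chained_switch(toy_fd_t *base, int graph_id)
{
    toy_fd_t *tail = base;
    toy_fd_t *fresh = toy_fd_new(graph_id);
    if (fresh == NULL)
        return -1;
    while (tail->next != NULL)
        tail = tail->next;
    tail->next = fresh;
    return 0;
}

/* Base-fd scheme: the base refers directly to the latest fd and
 * releases the previous "new fd". */
int toy_base_switch(toy_fd_t *base, int graph_id)
{
    toy_fd_t *fresh = toy_fd_new(graph_id);
    if (fresh == NULL)
        return -1;
    free(base->next);
    base->next = fresh;
    return 0;
}

/* Number of fd objects reachable from base, including base itself. */
int toy_fd_count(const toy_fd_t *base)
{
    int n = 0;
    for (; base != NULL; base = base->next)
        n++;
    return n;
}

/* Free base and everything reachable from it. */
void toy_fd_free_all(toy_fd_t *base)
{
    while (base != NULL) {
        toy_fd_t *next = base->next;
        free(base);
        base = next;
    }
}
```

Starting on graph 1 and switching through graphs 2 to 5, the chained
variant keeps five fd objects reachable, while the base-fd variant
keeps two: the base fd and the one fd on the latest graph.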
Avati