On Wed, Dec 3, 2014 at 5:59 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote: > On Wed, Dec 3, 2014 at 3:59 PM, Trond Myklebust > <trond.myklebust@xxxxxxxxxxxxxxx> wrote: >> Hi Olga, >> >> On Wed, Dec 3, 2014 at 2:54 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote: >>> Hi folks, >>> >>> I would like an opinion about changing code in such as way that we >>> don't select a delegation stateid for an IO operation when this >>> particular delegation is being returned. >>> >>> The reason it's some what problematic is that we send out a >>> DELEG_RETURN and then we don't remove the stateid until we receive a >>> reply. In the mean while, an IO operation can be happening and in >>> nfs4_select_rw_stateid() it sees a delegation stateid and uses it. >>> Well, at the server, it finishes processing DELEG_RETURN before >>> getting an IO op and by that time the server is finished with the >>> stateid and can error an IO operation with BAD_STATEID. >>> >>> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c >>> index 7f3f606..4f6f6c9 100644 >>> --- a/fs/nfs/delegation.c >>> +++ b/fs/nfs/delegation.c >>> @@ -854,7 +854,8 @@ bool nfs4_copy_delegation_stateid(nfs4_stateid >>> *dst, struct inode *in >>> flags &= FMODE_READ|FMODE_WRITE; >>> rcu_read_lock(); >>> delegation = rcu_dereference(nfsi->delegation); >>> - ret = (delegation != NULL && (delegation->type & flags) == flags); >>> + ret = (delegation != NULL && (delegation->type & flags) == flags && >>> + !test_bit(NFS_DELEGATION_RETURNING, &delegation->flags)); >>> if (ret) { >>> nfs4_stateid_copy(dst, &delegation->stateid); >>> nfs_mark_delegation_referenced(delegation); >> >> The above cannot eliminate the possibility that we won't use a >> delegation while it is being returned. It will at best just reduce the >> window of opportunity. >> > > You are right this are still problems. Actually, we might set the bit > but not yet get the open stateid from the open with deleg_cur and > that's not good. It would be good to know we got the open stateid and > then pick that. > >> So, why is this being considered to be a problem in the first place? >> Are you seeing a measurable performance impact on a real life workload >> (as opposed to some 1-in-a-billion occurrence from a QA test :-))? > > Unfortunately, this problem is quite common and I hit it all the time > on my setup. This leads to client seizing IO on that file and > returning EIO. It's an unrecoverable error. I'm trying to figure out > how to eliminate getting to that state. > It definitely isn't intended to be an irrecoverable error. The client is supposed to just replay the write after updating the stateid. -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@xxxxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html