Re: not picking a delegation stateid for IO when delegation stateid is being returned

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 3, 2014 at 3:59 PM, Trond Myklebust
<trond.myklebust@xxxxxxxxxxxxxxx> wrote:
> Hi Olga,
>
> On Wed, Dec 3, 2014 at 2:54 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>> Hi folks,
>>
>> I would like an opinion about changing code in such as way that we
>> don't select a delegation stateid for an IO operation when this
>> particular delegation is being returned.
>>
>> The reason it's some what problematic is that we send out a
>> DELEG_RETURN and then we don't remove the stateid until we receive a
>> reply. In the mean while, an IO operation can be happening and in
>> nfs4_select_rw_stateid() it sees a delegation stateid and uses it.
>> Well, at the server, it finishes processing DELEG_RETURN before
>> getting an IO op and by that time the server is finished with the
>> stateid and can error an IO operation with BAD_STATEID.
>>
>> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
>> index 7f3f606..4f6f6c9 100644
>> --- a/fs/nfs/delegation.c
>> +++ b/fs/nfs/delegation.c
>> @@ -854,7 +854,8 @@ bool nfs4_copy_delegation_stateid(nfs4_stateid
>> *dst, struct inode *in
>>         flags &= FMODE_READ|FMODE_WRITE;
>>         rcu_read_lock();
>>         delegation = rcu_dereference(nfsi->delegation);
>> -       ret = (delegation != NULL && (delegation->type & flags) == flags);
>> +       ret = (delegation != NULL && (delegation->type & flags) == flags &&
>> +               !test_bit(NFS_DELEGATION_RETURNING, &delegation->flags));
>>         if (ret) {
>>                 nfs4_stateid_copy(dst, &delegation->stateid);
>>                 nfs_mark_delegation_referenced(delegation);
>
> The above cannot eliminate the possibility that we won't use a
> delegation while it is being returned. It will at best just reduce the
> window of opportunity.
>

You are right this are still problems. Actually, we might set the bit
but not yet get the open stateid from the open with deleg_cur and
that's not good. It would be good to know we got the open stateid and
then pick that.

> So, why is this being considered to be a problem in the first place?
> Are you seeing a measurable performance impact on a real life workload
> (as opposed to some 1-in-a-billion occurrence from a QA test :-))?

Unfortunately, this problem is quite common and I hit it all the time
on my setup. This leads to client seizing IO on that file and
returning EIO. It's an unrecoverable error. I'm trying to figure out
how to eliminate getting to that state.


>
> Cheers
>   Trond
> --
> Trond Myklebust
>
> Linux NFS client maintainer, PrimaryData
>
> trond.myklebust@xxxxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux