Re: [PATCH 1/3] pnfs: avoid incorrect use of layout stateid

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2011-02-01 18:31, Fred Isaman wrote:
> 
> On Feb 1, 2011, at 10:38 AM, Benny Halevy wrote:
> 
>> On 2011-01-31 17:27, Fred Isaman wrote:
>>> The code could violate the following from RFC5661, section 12.5.3:
>>> "Once a client has no more layouts on a file, the layout stateid is no
>>> longer valid and MUST NOT be used."
>>
>> NACK.
>>
>> Fred, this is your interpretation of the forgetful model and it is taken
>> out of context.
>>
>> Until the spec is changed only the server invalidates the stateid by returning
>> lrs_present=false on LAYOUTRETURN. Merely forgetting the layout state without
>> LAYOUTRETURN does not determine that.
>>
>>
>> Also from 12.5.3:
>>   Once a layout stateid is established, the "other"
>>   field will stay constant unless the stateid is revoked or the client
>>   returns all layouts on the file and the server disposes of the stateid.
>>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
> 
> 
> OK, so the question is, does sending a LAYOUTGET with openstateid "return all layouts".
> Making the answer to this "yes" is perfectly consistent with the spec, given its complete absence
> of direction here.

I disagree, and so are other server implementations, including linux-pnfs!
(It will return BAD_STATEID if the client forgets the layout
in any case but LAYOUTRETURN or CB_LAYOUTRECALL/NOMATCHING_LAYOUT)

>  The alternative is a forgetful model where the client can forget layouts, but
> not layout stateid.

Right, and this is the direction we should go until the errata is in place.

> 
> The question is, is there any objection to the above interpretation of LAYOUTGET with openstateid?
> Because if not, then we will shortly get an appropriate errata,
> and there is no good reason to delay the patch until the errata is finalized.
> I know previously you had objected on the basis that this prevents parallel use of openstateid with LAYOUTGET.
> But given that you had indicated that such parallel use can not be done for other reasons,
> I have been working on the assumption that the interpretation above will be accepted without controversy at the upcoming IETF.
> 
> On the other hand, if there is objection to the above interpretation, that should be made known.

I simulated this with the layout-sim and given the LAYOUTGET sent with the initial stateid
is always fully serialized with other layout stateid-changing operations, this model works.

I object the timing, before the IETF discussion is over and before the upcoming Connectathon
where other server vendors have no chance to implement this erratum.

Benny

> 
> Fred
> 
> 
>> and
>>
>>   Once the client receives a layout stateid, it MUST use the correct
>>   "seqid" for subsequent LAYOUTGET or LAYOUTRETURN operations.  The
>>   correct "seqid" is defined as the highest "seqid" value from
>>   responses of fully processed LAYOUTGET or LAYOUTRETURN operations or
>>   arguments of a fully processed CB_LAYOUTRECALL operation.
>>
>> and 18.43.3
>>
>>   The loga_stateid field specifies a valid stateid.  If a layout is not
>>   currently held by the client, the loga_stateid field represents a
>>   stateid reflecting the correspondingly valid open, byte-range lock,
>>   or delegation stateid.  Once a layout is held on the file by the
>>   client, the loga_stateid field MUST be a stateid as returned from a
>>   previous LAYOUTGET or LAYOUTRETURN operation or provided by a
>>   CB_LAYOUTRECALL operation (see Section 12.5.3).
>>
>> Only when we agree that the client can throw away the layout state and
>> send a singular future LAYOUTGET with a non-layout stateid - something it MUST not
>> do at the moment - only then we can do what this patch suggests.
>>
>> Benny
>>
>>>
>>> This can occur when a layout already has a lseg, starts another
>>> non-everlapping LAYOUTGET, and a CB_LAYOUTRECALL for the existing lseg
>>> is processed before we hit pnfs_layout_process().
>>>
>>> Solve by setting, each time the client has no more lsegs for a file, a
>>> flag which blocks further use of the layout and triggers its removal.
>>>
>>> This also fixes a second bug which occurs in the same instance as
>>> above.  If we actually use pnfs_layout_process, we add the new lseg to
>>> the layout, but the layout has been removed from the nfs_client list
>>> by the intervening CB_LAYOUTRECALL and will not be added back.  Thus
>>> the newly acquired lseg will not be properly returned in the event of
>>> a subsequent CB_LAYOUTRECALL.
>>>
>>> Signed-off-by: Fred Isaman <iisaman@xxxxxxxxxx>
>>> ---
>>> fs/nfs/pnfs.c |   12 +++++++++---
>>> 1 files changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>>> index 1b1bc1a..dcd5d54 100644
>>> --- a/fs/nfs/pnfs.c
>>> +++ b/fs/nfs/pnfs.c
>>> @@ -255,6 +255,9 @@ put_lseg_locked(struct pnfs_layout_segment *lseg,
>>> 			list_del_init(&lseg->pls_layout->plh_layouts);
>>> 			spin_unlock(&clp->cl_lock);
>>> 			clear_bit(NFS_LAYOUT_BULK_RECALL, &lseg->pls_layout->plh_flags);
>>> +			set_bit(NFS_LAYOUT_DESTROYED, &lseg->pls_layout->plh_flags);
>>> +			/* Matched by initial refcount set in alloc_init_layout_hdr */
>>> +			put_layout_hdr_locked(lseg->pls_layout);
>>> 		}
>>> 		rpc_wake_up(&NFS_SERVER(ino)->roc_rpcwaitq);
>>> 		list_add(&lseg->pls_list, tmp_list);
>>> @@ -299,6 +302,11 @@ mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo,
>>>
>>> 	dprintk("%s:Begin lo %p\n", __func__, lo);
>>>
>>> +	if (list_empty(&lo->plh_segs)) {
>>> +		if (!test_and_set_bit(NFS_LAYOUT_DESTROYED, &lo->plh_flags))
>>> +			put_layout_hdr_locked(lo);
>>> +		return 0;
>>> +	}
>>> 	list_for_each_entry_safe(lseg, next, &lo->plh_segs, pls_list)
>>> 		if (should_free_lseg(lseg->pls_range.iomode, iomode)) {
>>> 			dprintk("%s: freeing lseg %p iomode %d "
>>> @@ -332,10 +340,7 @@ pnfs_destroy_layout(struct nfs_inode *nfsi)
>>> 	spin_lock(&nfsi->vfs_inode.i_lock);
>>> 	lo = nfsi->layout;
>>> 	if (lo) {
>>> -		set_bit(NFS_LAYOUT_DESTROYED, &nfsi->layout->plh_flags);
>>> 		mark_matching_lsegs_invalid(lo, &tmp_list, IOMODE_ANY);
>>> -		/* Matched by refcount set to 1 in alloc_init_layout_hdr */
>>> -		put_layout_hdr_locked(lo);
>>> 	}
>>> 	spin_unlock(&nfsi->vfs_inode.i_lock);
>>> 	pnfs_free_lseg_list(&tmp_list);
>>> @@ -403,6 +408,7 @@ pnfs_layoutgets_blocked(struct pnfs_layout_hdr *lo, nfs4_stateid *stateid,
>>> 	    (int)(lo->plh_barrier - be32_to_cpu(stateid->stateid.seqid)) >= 0)
>>> 		return true;
>>> 	return lo->plh_block_lgets ||
>>> +		test_bit(NFS_LAYOUT_DESTROYED, &lo->plh_flags) ||
>>> 		test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags) ||
>>> 		(list_empty(&lo->plh_segs) &&
>>> 		 (atomic_read(&lo->plh_outstanding) > lget));
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux