Re: [RFC PATCH] NFSD: Support write delegations for pNFS LAYOUT operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Jun 11, 2024, at 9:32 AM, Tom Talpey <tom@xxxxxxxxxx> wrote:
> 
> On 6/10/2024 7:54 PM, Trond Myklebust wrote:
>> On Mon, 2024-06-10 at 13:43 -0400, Chuck Lever wrote:
>>> On Mon, Jun 10, 2024 at 04:21:33PM +0000, Trond Myklebust wrote:
>>>> On Mon, 2024-06-10 at 11:04 -0400, cel@xxxxxxxxxx wrote:
>>>>> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
>>>>> 
>>>>> I noticed LAYOUTGET(LAYOUTIOMODE4_RW) returning NFS4ERR_ACCESS
>>>>> unexpectedly. The NFS client had created a file with mode 0444,
>>>>> and
>>>>> the server had returned a write delegation on the OPEN(CREATE).
>>>>> The
>>>>> client was requesting a RW layout using the write delegation
>>>>> stateid
>>>>> so that it could flush file modifications.
>>>>> 
>>>>> This client behavior was permitted for NFSv4.1 without pNFS, so I
>>>>> began looking at NFSD's implementation of LAYOUTGET.
>>>>> 
>>>>> The failure was because fh_verify() was doing a permission check
>>>>> as
>>>>> part of verifying the FH. It uses the loga_iomode value to
>>>>> specify
>>>>> the @accmode argument. fh_verify(MAY_WRITE) on a file whose mode
>>>>> is
>>>>> 0444 fails with -EACCES.
>>>>> 
>>>>> RFC 8881 Section 18.43.3 states:
>>>>>> The use of the loga_iomode field depends upon the layout type,
>>>>>> but
>>>>>> should reflect the client's data access intent.
>>>>> 
>>>>> Further discussion of iomode values focuses on how the server is
>>>>> permitted to change returned the iomode when coalescing layouts.
>>>>> It says nothing about mandating the denial of LAYOUTGET requests
>>>>> due to file permission settings.
>>>>> 
>>>>> Appropriate permission checking is done when the client acquires
>>>>> the
>>>>> stateid used in the LAYOUTGET operation, so remove the permission
>>>>> check from LAYOUTGET and LAYOUTCOMMIT, and rely on layout stateid
>>>>> checking instead.
>>>> 
>>>> Hmm... I'm not seeing any checking or enforcement of the
>>>> EXCHGID4_FLAG_BIND_PRINC_STATEID flag in knfsd.
>>> 
>>> I appreciate your review!
>>> 
>>> I see that BIND_PRINC_STATEID is not set by NFSD. RFC 8881 Section
>>> 18.16.4 says:
>>>> Note that if the client ID was not created with the
>>>> EXCHGID4_FLAG_BIND_PRINC_STATEID capability set in the reply to
>>>> EXCHANGE_ID, then the server MUST NOT impose any requirement that
>>>> READs and WRITEs sent for an open file have the same credentials
>>>> as the OPEN itself, and the server is REQUIRED to perform access
>>>> checking on the READs and WRITEs themselves.
>>> 
>>> 
>>> Trond:
>>>> Doesn't that mean that
>>>> READ and WRITE are required to check access permissions, as per
>>>> RFC8881, section 13.9.2.3?
>>> 
>>> Every NFSv4 READ and WRITE calls nfs4_preprocess_stateid_op(), and
>>> nfs4_preprocess_stateid_op() calls nfsd_permission() (in
>>> nfs4_check_file()). Seems like we're covered.
>>> 
>>> 
>>> Trond:
>>>> I'm not saying that the return of an ACCESS error is correct here,
>>>> since the file was created using this credential, and so the caller
>>>> should benefit from having owner privileges.
>>> 
>>> The alternative is to preserve the accmode logic but instead add the
>>> NFSD_MAY_OWNER_OVERRIDE flag, me thinks, which is not objectionable.
>>> 
>>> 
>>> Trond:
>>>> However the issue is that knfsd should be either checking that the
>>>> credential is correct w.r.t. the stateid (if
>>>> EXCHGID4_FLAG_BIND_PRINC_STATEID is set and the stateid is not a
>>>> special stateid) or that it is correct w.r.t. the file permissions
>>>> (if
>>>> EXCHGID4_FLAG_BIND_PRINC_STATEID is not set or the stateid is a
>>>> special
>>>> stateid).
>>> 
>>> But LAYOUTGET is not a READ or WRITE. I don't see language that
>>> requires stateid / credential checking for it, but it's always
>>> possible I might have missed something.
>>> 
>>> Also, RFC 5663 has nothing to say about BIND_PRINC_STATEID. Further,
>>> I'm not sure how a SCSI READ or WRITE can check credentials as NFS
>>> stateids are not presented to SCSI/iSCSI targets.
>>> 
>>> Likewise, how would this impact a flexfile layout that targets
>>> an NFSv3 DS?
>>> 
>>> I think NFSD is checking stateids used for NFSv4 READ and WRITE as
>>> needed, but help me understand how BIND_PRINC_STATEID is applicable
>>> to pNFS block layouts...?
>>> 
>>> 
>> If you look at Section 15.2, then NFS4ERR_ACCESS is definitely listed
>> as a valid error for LAYOUTGET (in fact, the very first in the list).
>> Furthermore, please see the following blanket statement in RFC8881,
>> section 12.5.1.:
>>    "Layouts are provided to NFSv4.1 clients, and user access still
>>    follows the rules of the protocol as if they did not exist."
>> While you can argue that the above language is not normative, because
>> it doesn't use the official IETF "MUST" / "MUST NOT" /..., it clearly
>> does declare an intention to ensure that pNFS not be allowed to weaken
>> the protocol.
>> So my point would be that if LAYOUTGET is the only time where a proper
>> credential check can occur, then it needs to happen there. Otherwise,
>> your implementation is responsible for ensuring that it happens at some
>> other time, in order to ensure that user access follows the rules of
>> the protocol.
> 
> Absolutely agreed. The MDS needs to verify access before returning
> the layout, for no other reason that the DS can't perform the same
> verification.

I have trouble believing that something that critical has been
left by the spec authors as an implied implementation quality
issue.

The bulk of Sections 12.5.1 and .2 discuss how storage devices
need to handle things like writes outside of a layout and
other corner cases, including permission changes during I/O.
Clearly the spec authors expect that DS targets will deal
properly with these situations without the use of stateids.

BIND_PRINC_STATEID is never explicitly mentioned here.
Instead, the OPEN, LOCK, and ACCESS operations are clearly
mentioned as the points in time when file permission is
inspected by the server (ie, these operations provide the
main access verification check that precedes the successful
retrieval of a layout).

Block layouts then use SCSI persistent reservation and layout
recall to handle I/O-time access checking.

Therefore, IMO, BIND_PRINC_STATEID applies only to NFSv4
READ and WRITE, because the spec as it stands is not sane
otherwise.

However, the objection is noted. I'm working on an alternate
more conservative fix for the write delegation issue that
motivated this patch. Stay tuned for v2.


--
Chuck Lever






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux