> On Aug 22, 2024, at 1:20 PM, Mike Snitzer <snitzer@xxxxxxxxxx> wrote: > > On Thu, Aug 22, 2024 at 01:07:29PM -0400, Jeff Layton wrote: >> On Thu, 2024-08-22 at 12:04 -0400, Mike Snitzer wrote: >>> On Thu, Aug 22, 2024 at 11:07:47AM -0400, Chuck Lever wrote: >>>> On Wed, Aug 21, 2024 at 05:23:56PM -0400, Mike Snitzer wrote: >>>>> On Wed, Aug 21, 2024 at 01:46:02PM -0400, Jeff Layton wrote: >>>>>> On Mon, 2024-08-19 at 14:17 -0400, Mike Snitzer wrote: >>>>>>> Fixes stop-gap used in previous commit where caller avoided using >>>>>>> tracepoint if rqstp is NULL. Instead, have each tracepoint avoid >>>>>>> dereferencing NULL rqstp. >>>>>>> >>>>>>> Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx> >>>>>>> --- >>>>>>> fs/nfsd/nfsfh.c | 12 ++++-------- >>>>>>> fs/nfsd/trace.h | 36 +++++++++++++++++++++--------------- >>>>>>> 2 files changed, 25 insertions(+), 23 deletions(-) >>>>>>> >>>>>>> diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c >>>>>>> index 19e173187ab9..bae727e65214 100644 >>>>>>> --- a/fs/nfsd/nfsfh.c >>>>>>> +++ b/fs/nfsd/nfsfh.c >>>>>>> @@ -195,8 +195,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst >>>>>>> *rqstp, struct net *net, >>>>>>> >>>>>>> error = nfserr_stale; >>>>>>> if (IS_ERR(exp)) { >>>>>>> - if (rqstp) >>>>>>> - trace_nfsd_set_fh_dentry_badexport(rqstp, >>>>>>> fhp, PTR_ERR(exp)); >>>>>>> + trace_nfsd_set_fh_dentry_badexport(rqstp, fhp, >>>>>>> PTR_ERR(exp)); >>>>>>> >>>>>>> if (PTR_ERR(exp) == -ENOENT) >>>>>>> return error; >>>>>>> @@ -244,8 +243,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst >>>>>>> *rqstp, struct net *net, >>>>>>> data_left, >>>>>>> fileid_type, 0, >>>>>>> nfsd_acceptable, >>>>>>> exp); >>>>>>> if (IS_ERR_OR_NULL(dentry)) { >>>>>>> - if (rqstp) >>>>>>> - >>>>>>> trace_nfsd_set_fh_dentry_badhandle(rqstp, fhp, >>>>>>> + trace_nfsd_set_fh_dentry_badhandle(rqstp, >>>>>>> fhp, >>>>>>> dentry ? PTR_ERR(dentry) : >>>>>>> -ESTALE); >>>>>>> switch (PTR_ERR(dentry)) { >>>>>>> case -ENOMEM: >>>>>>> @@ -321,8 +319,7 @@ __fh_verify(struct svc_rqst *rqstp, >>>>>>> dentry = fhp->fh_dentry; >>>>>>> exp = fhp->fh_export; >>>>>>> >>>>>>> - if (rqstp) >>>>>>> - trace_nfsd_fh_verify(rqstp, fhp, type, access); >>>>>>> + trace_nfsd_fh_verify(net, rqstp, fhp, type, access); >>>>>>> >>>>>>> /* >>>>>>> * We still have to do all these permission checks, even >>>>>>> when >>>>>>> @@ -376,8 +373,7 @@ __fh_verify(struct svc_rqst *rqstp, >>>>>>> /* Finally, check access permissions. */ >>>>>>> error = nfsd_permission(cred, exp, dentry, access); >>>>>>> out: >>>>>>> - if (rqstp) >>>>>>> - trace_nfsd_fh_verify_err(rqstp, fhp, type, access, >>>>>>> error); >>>>>>> + trace_nfsd_fh_verify_err(net, rqstp, fhp, type, access, >>>>>>> error); >>>>>>> if (error == nfserr_stale) >>>>>>> nfsd_stats_fh_stale_inc(nn, exp); >>>>>>> return error; >>>>>>> diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h >>>>>>> index 77bbd23aa150..d49b3c1e3ba9 100644 >>>>>>> --- a/fs/nfsd/trace.h >>>>>>> +++ b/fs/nfsd/trace.h >>>>>>> @@ -195,12 +195,13 @@ TRACE_EVENT(nfsd_compound_encode_err, >>>>>>> >>>>>>> TRACE_EVENT(nfsd_fh_verify, >>>>>>> TP_PROTO( >>>>>>> + const struct net *net, >>>>>>> const struct svc_rqst *rqstp, >>>>>>> const struct svc_fh *fhp, >>>>>>> umode_t type, >>>>>>> int access >>>>>>> ), >>>>>>> - TP_ARGS(rqstp, fhp, type, access), >>>>>>> + TP_ARGS(net, rqstp, fhp, type, access), >>>>>>> TP_STRUCT__entry( >>>>>>> __field(unsigned int, netns_ino) >>>>>>> __sockaddr(server, rqstp->rq_xprt->xpt_remotelen) >>>>>>> @@ -212,12 +213,14 @@ TRACE_EVENT(nfsd_fh_verify, >>>>>>> __field(unsigned long, access) >>>>>>> ), >>>>>>> TP_fast_assign( >>>>>>> - __entry->netns_ino = SVC_NET(rqstp)->ns.inum; >>>>>>> - __assign_sockaddr(server, &rqstp->rq_xprt- >>>>>>>> xpt_local, >>>>>>> - rqstp->rq_xprt->xpt_locallen); >>>>>>> - __assign_sockaddr(client, &rqstp->rq_xprt- >>>>>>>> xpt_remote, >>>>>>> - rqstp->rq_xprt->xpt_remotelen); >>>>>>> - __entry->xid = be32_to_cpu(rqstp->rq_xid); >>>>>>> + __entry->netns_ino = net->ns.inum; >>>>>>> + if (rqstp) { >>>>>>> + __assign_sockaddr(server, &rqstp->rq_xprt- >>>>>>>> xpt_local, >>>>>>> + rqstp->rq_xprt- >>>>>>>> xpt_locallen); >>>>>>> + __assign_sockaddr(client, &rqstp->rq_xprt- >>>>>>>> xpt_remote, >>>>>>> + rqstp->rq_xprt- >>>>>>>> xpt_remotelen); >>>>>>> + } >>>>>> >>>>>> Does this need an else branch to set these values to something when >>>>>> rqstp is NULL, or are we guaranteed that they are already zeroed out >>>>>> when they aren't assigned? >>>>> >>>>> I'm not sure. It isn't immediately clear what is actually using these. >>>>> >>>>> But I did just notice an inconsistency, these entry members are defined: >>>>> >>>>> __sockaddr(server, rqstp->rq_xprt->xpt_remotelen) >>>>> __sockaddr(client, rqstp->rq_xprt->xpt_remotelen) >>>>> >>>>> Yet they go on to use rqstp->rq_xprt->xpt_locallen and >>>>> rqstp->rq_xprt->xpt_remotelen respectively. >>>>> >>>>> Chuck, would welcome your feedback on how to properly fix these >>>>> tracepoints to handle rqstp being NULL. And the inconsistency I just >>>>> noted is something extra. >>>> >>>> First, a comment about patch ordering: I think you can preserve >>>> attribution but make these a little easier to digest if you reverse >>>> 4/ and 5/. Fix the problem before it becomes a problem, as it were. >>>> >>>> As a general remark, I would prefer to retain the trace points and >>>> even the address information in the local I/O case: the client >>>> address is an important part of the decision to permit or deny >>>> access to the FH in question. The issue is how to make that >>>> happen... >>>> >>>> The __sockaddr() macros I think will trigger an oops if >>>> rqstp == NULL. The second argument determines the size of a >>>> variable-length trace field IIRC. One way to avoid that is to use a >>>> fixed size field for the addresses (big enough to store an IPv6 >>>> address? or an abstract address? those can get pretty big) >>>> >>>> I need to study 4/ more closely; perhaps it is doing too much in a >>>> single patch. (ie, the code ends up in a better place, but the >>>> details of the transition are obscured by being lumped together into >>>> one patch). >>>> >>>> So, can you or Neil answer: what would appear as the client address >>>> for local I/O ? >>> >>> Before when there was the "fake" svc_rqst it was initialized with: >>> >>> /* Note: we're connecting to ourself, so source addr == peer addr */ >>> rqstp->rq_addrlen = rpc_peeraddr(rpc_clnt, >>> (struct sockaddr *)&rqstp->rq_addr, >>> sizeof(rqstp->rq_addr)); >>> >>> Anyway, as the code is also now: the rpc_clnt passed to >>> nfsd_open_local_fh() will reflect the same address as the server. I agree it should be the client's actual address that is recorded in nfsd_open_local_fh() and later reported by trace points or other instrumentation. >>> My thinking was that for localio there doesn't need to be any explicit >>> listing of the address info in the tracepoints (but that'd be more >>> convincing if we at least logged localio by looking for and logging >>> NFSD_MAY_LOCALIO in mayflags passed to nfsd_file_acquire_local). >>> >>> But I agree it'd be nice to have tracepoints log matching 127.0.0.1 or >>> ::1, etc -- just don't think it strictly necessary. The usual usage scenario for the IP address in these particular trace points is to troubleshoot incorrect security policy settings (ie, export options based on client IP address). I'm not exactly clear on how export options apply to LOCALIO open requests -- that's the main focus of the security-related review I plan to do. (Similarity to remote NFSv3 behavior, as described in your other email reply, is indeed the preferred model, IMO). >>> Open to whatever you think best. >>> >> >> The client is likely to be coming from a different container in many >> cases and so won't be coming in via the loopback interface. Presenting >> a loopback address in that case seems wrong. > > Right, wasn't suggesting to always display loopback interface, was > saying ideal to display reality. There isn't anything special in the > rpc_clnt address. The localio RPC client is connecting to the server > using the same rpcclient as NFS (cloned from that used for NFS). > >> What would be ideal IMO would be to still display the addresses from >> the rpc_clnt and just display a flag or something that shows that this >> is a localio request. Having to pass that as an additional arg to >> __fh_verify would be pretty ugly though (and may be a layering >> violation). NFSD_MAY_LOCALIO might be set only for LOCALIO opens. I don't think it is today, but that could be one way to indicate the difference in callers. > I agree. But yeah, I don't have a sense for how to do this cleanly. > > Pass rpc_clnt in and only use it for address if mayflags has > NFSD_MAY_LOCALIO set _and_ rqstop is NULL? Will have a look, but I bet Neil might be the best person to propose an initial design. -- Chuck Lever