Re: [PATCH] NFSD: remove redundant dprintk in exp_rootfh

"NeilBrown" <neilb@xxxxxxx> · Tue, 09 Apr 2024 13:06:00 +1000

On Tue, 09 Apr 2024, Chuck Lever wrote:
> On Tue, Apr 09, 2024 at 11:12:12AM +1000, NeilBrown wrote:
> > On Tue, 09 Apr 2024, Chuck Lever wrote:
> > > On Tue, Apr 09, 2024 at 09:21:54AM +1000, NeilBrown wrote:
> > > > On Tue, 09 Apr 2024, Chen Hanxiao wrote:
> > > > > trace_nfsd_ctl_filehandle in write_filehandle has
> > > > > some similar infos.
> > > > 
> > > > Not all that similar.  The dprintk you are removing includes the inode
> > > > number and sb->s_id which the trace point don't include.
> > > > 
> > > > Why do you think that information isn't needed?
> > > 
> > > I asked him to remove that dprintk.
> > > 
> > > Can you say why you think that information is useful to provide
> > > via a dprintk? It doesn't seem useful to system administrators,
> > > IMO.
> > 
> > I'm not saying it is useful, but I don't think the onus is on me.
> 
> No the onus isn't on you. I'm merely asking for feedback (and I
> explain why below).
> 
> 
> > When removing tracing information, the commit message should at least
> > acknowledge what is being removed and make some attempt to justify it.
> > In this case the commit message claimed that nothing was being removed,
> > which is clearly false.  Maybe just the commit message needs to be
> > fixed.
> 
> Agreed, the patch description could have more detail and a proper
> justification.
> 
> 
> > I don't think these tracepoints are just for system administrators.
> > They are also for tech support when they are trying to remotely diagnose
> > a customer problem.  It is really hard to know what will actually be
> > useful.  Many times I have found that the particular information that I
> > want isn't available in any tracing.  I expect this is inevitable.  We
> > cannot trace EVERYTHING so there will always be gaps.
> 
> When there is no in-code tracepoint available, the usual course of
> action these days is to wheel out BPF, systemtap, or drgn. Distro
> support engineers know how to do that. Server administrators might
> not be so well trained, so I consider dprintk() of primary
> importance to them.

Point for clarification: do you see tracepoints and dprintk as having
different audiences, or do you see them as serving the same purpose with
dprintk being a legacy implementation which is slowly being transitioned
to tracepoints?  I had assumed the latter but your language above makes
me wonder.

Certainly systemtap and other are invaluable and should be understood by
support engineers, but its a whole lot easier (and quicker) if the
useful information is easily available.  Our first-level support can
easily ask for a rpcdebug or trace-cmd trace and if we can get all the
useful information from there, that is a big win.
(It'll be nice when we can stop using rpcdebug and only ask for
trace-cmd output).

> 
> Here the dprintk() is reporting information that seems useful only
> to kernel developers. That's a code smell IMO. And the guidance in
> these cases, historically, has been to remove such observability
> either before a patch is merged, or after the fact, as we're doing
> here.

Interesting...  I had always assumed that dprintk/trace was largely
useless without some understanding of the inner workings of the kernel. 
I certainly need to dig around before I can work out how to interpret
the trace information.  So I think of all of it as "useful only to
kernel developers" ... or potential developers.

> 
> 
> > But removing some information that was previously generated seems
> > like a regression that should at least be justified.
> > In the case of write_filehandle() we are now tracing the request, but
> > not the result.  Is this generally sensible?
> 
> Let's instead look at the specific situation. The purpose of the
> nfsd_ctl_* tracepoints is to record in the trace log when 
> configuration changes are made, in order to juxtapose those with
> other server operation.

configuration changes?  write_filehandle is used by mountd to get a
filehandle to return to a v3 MOUNT request.  I guess that is a config
change on the client, but not on the server.

> 
> So, here, it's quite sensible. We want to observe the information
> that was passed from user space, and the starting timestamp.
> 
> 
> > Would it not make sense to
> > trace both after the core of the function (exp_rootfh) has completed?
> 
> In some cases there are already tracepoints that would report or
> infer the new state, so reporting a result would be redundant in
> those cases.

Well there was already a dprintk, but that is being removed...  I don't
know what other tracepoint might already provide this info.  Maybe I'm
confused about your meaning.

> 
> 
> > At lease the knfsd_fh_hash() of the generated filehandle could be
> > reported.
> 
> Chen's original patch replaced the dprintk with a tracepoint. So I
> asked, a week or so ago, for exactly this kind of feedback. There
> have been no responses until now. Therefore it seemed logical to
> assume no-one had a use for this info.

Sorry - I don't pay as close attention to nfs traffic as I might like.
(In this case I was on leave a week ago...)

> 
> The folks whom this information would serve have been silent to date
> with any specific suggestions, and usually we hear from someone
> quickly when removing observability that is depended upon.
> 
> We could record the FH hash, but what would it be used for? User
> space requested the FH, which can be reported by the requesting
> program.

What is the FH hash ever used for?  Presumably for tracking a particular
object through multiple tracepoints.  Maybe we need to see when the dir
that the client mounted gets accessed later?

> 
> I'm not hearing a convincing specific justification for maintaining
> observability here.... but we have a few more weeks before making a
> more permanent decision.
> 

I'm not seeing a significant cost in maintaining observability.  But I
don't know that I care quite enough to write a patch.  So I'll leave it
up to you.

Thanks,
NeilBrown

> 
> -- 
> Chuck Lever
>