Re: [PATCH v1 0/6] Deprecate dprintk in svcrdma

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Wed, 13 Oct 2021 16:49:55 +0000

> On Oct 13, 2021, at 11:59 AM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> 
> On Wed, Oct 13, 2021 at 10:46:49AM -0400, Chuck Lever wrote:
>> This patch series moves forward with the removal of dprintk in
>> SUNRPC in favor of tracepoints. This is the last step for the
>> svcrdma component.
> 
> Makes sense to me.
> 
> I would like some (very short) documentation, somewhere.  Partly just
> for my sake!  I'm not sure exactly what to recommend to bug reporters.
> 
> I guess 
> 
> 	trace-cmd record -e 'sunrpc:*'
> 	trace-cmd report
> 
> would be a rough substitute for "rpcdebug -m rpc -s all"?

It would, but tracepoints can be enabled one event
at a time. If you're looking for a direct replacement
for a specific rpcdebug invocation, it might be better
to examine the current sunrpc debug facilities and
provide specific command lines to mimic those.

"rpcdebug -vh" gives us:

rpc        xprt call debug nfs auth bind sched trans svcsock svcdsp misc cache all
nfs        vfs dircache lookupcache pagecache proc xdr file root callback client mount fscache pnfs pnfs_ld state all
nfsd       sock fh export svc proc fileop auth repcache xdr lockd all
nlm        svc client clntlock svclock monitor clntsubs svcsubs hostcache xdr all

If tracepoints are named carefully, we can provide
specific command lines to enable them as groups. So,
for instance, I was thinking rpcdebug might display:

	trace-cmd list | grep svcrdma

to list tracepoints related to server side RDMA, or:

	trace-cmd list | grep svcsock

to show tracepoints related to server side sockets.
Then:

	trace-cmd record -e sunrpc:svcsock\*

enables just the socket-related trace events, which
coincidentally happens to line up with:

	rpcdebug -m rpc -s svcsock

> Do we have a couple examples of issues that could be diagnosed with
> tracepoints?

Anything you can do with dprintk you can do with trace
points. Plus because tracepoints are lower overhead, they
can be enabled and used in production environments,
unlike dprintk.

Also, tracepoints can trigger specific user space actions
when they fire. You could for example set up a tracepoint
in the RPC client that fires when a retransmit timeout
occurs, and it could trigger a script to start tcpdump.

> In the past I don't feel like I've ended up using dprintks
> all that much; somehow they're not usually where I need them.  But maybe
> that's just me.  And maybe as we put more thought into where tracepoints
> should be, they'll get more useful.

> Documentation/filesystems/nfs/, or the linux-nfs wiki, could be easy
> places to put it.  Though *something* in the man pages would be nice.
> At a minimum, a warning in rpcdebug(8) that we're gradually phasing out
> dprintks.

As I understood the conversation last week, SteveD and
DaveW volunteered to be responsible for changes to
rpcdebug?

So far we haven't had much documentation for dprintk. That
means we are starting more or less from scratch for
explaining observability in the NFS stacks. Free rein, and
all that.

--
Chuck Lever