On Wed, Oct 13, 2021 at 5:03 PM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote: > > > > > On Oct 13, 2021, at 2:35 PM, David Wysochanski <dwysocha@xxxxxxxxxx> wrote: > > > > On Wed, Oct 13, 2021 at 12:50 PM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote: > >> > >> > >> > >>> On Oct 13, 2021, at 11:59 AM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote: > >>> > >>> On Wed, Oct 13, 2021 at 10:46:49AM -0400, Chuck Lever wrote: > >>>> This patch series moves forward with the removal of dprintk in > >>>> SUNRPC in favor of tracepoints. This is the last step for the > >>>> svcrdma component. > >>> > >>> Makes sense to me. > >>> > >>> I would like some (very short) documentation, somewhere. Partly just > >>> for my sake! I'm not sure exactly what to recommend to bug reporters. > >>> > >>> I guess > >>> > >>> trace-cmd record -e 'sunrpc:*' > >>> trace-cmd report > >>> > >>> would be a rough substitute for "rpcdebug -m rpc -s all"? > >> > >> It would, but tracepoints can be enabled one event > >> at a time. If you're looking for a direct replacement > >> for a specific rpcdebug invocation, it might be better > >> to examine the current sunrpc debug facilities and > >> provide specific command lines to mimic those. > >> > >> "rpcdebug -vh" gives us: > >> > >> rpc xprt call debug nfs auth bind sched trans svcsock svcdsp misc cache all > >> nfs vfs dircache lookupcache pagecache proc xdr file root callback client mount fscache pnfs pnfs_ld state all > >> nfsd sock fh export svc proc fileop auth repcache xdr lockd all > >> nlm svc client clntlock svclock monitor clntsubs svcsubs hostcache xdr all > >> > >> > >> If tracepoints are named carefully, we can provide > >> specific command lines to enable them as groups. So, > >> for instance, I was thinking rpcdebug might display: > >> > >> trace-cmd list | grep svcrdma > >> > >> to list tracepoints related to server side RDMA, or: > >> > >> trace-cmd list | grep svcsock > >> > >> to show tracepoints related to server side sockets. > >> Then: > >> > >> trace-cmd record -e sunrpc:svcsock\* > >> > >> enables just the socket-related trace events, which > >> coincidentally happens to line up with: > >> > >> rpcdebug -m rpc -s svcsock > >> > >> > >>> Do we have a couple examples of issues that could be diagnosed with > >>> tracepoints? > >> > >> Anything you can do with dprintk you can do with trace > >> points. Plus because tracepoints are lower overhead, they > >> can be enabled and used in production environments, > >> unlike dprintk. > >> > >> Also, tracepoints can trigger specific user space actions > >> when they fire. You could for example set up a tracepoint > >> in the RPC client that fires when a retransmit timeout > >> occurs, and it could trigger a script to start tcpdump. > >> > >> > >>> In the past I don't feel like I've ended up using dprintks > >>> all that much; somehow they're not usually where I need them. But maybe > >>> that's just me. And maybe as we put more thought into where tracepoints > >>> should be, they'll get more useful. > >> > >>> Documentation/filesystems/nfs/, or the linux-nfs wiki, could be easy > >>> places to put it. Though *something* in the man pages would be nice. > >>> At a minimum, a warning in rpcdebug(8) that we're gradually phasing out > >>> dprintks. > >> > >> As I understood the conversation last week, SteveD and > >> DaveW volunteered to be responsible for changes to > >> rpcdebug? > >> > > > > Well I don't remember it exactly like that, but it's probably close. > > > > I made a suggestion for the last kernel patch that deprecates any > > rpcdebug facility, to leave one dfprintk in, stating there is no > > information in the kernel anymore for this facility, so not to expect > > this rpcdebug flag to produce any meaningful debug output, and > > possibly redirect to ftrace facilities. I brought that idea up > > because of my fscache patches which totally removed the last dfprintk > > in NFS fscache, and I wasn't sure what the deprecation procedure > > was. As I recall you didn't like that idea as it was never done before > > with other rpcdebug flag deprecations, and it was shot down. > > > > I suppose we could put the same type of userspace patch to rpcdebug > > that looks for kernel versions and prints a message if someone tries > > to use a deprecated flag? Would that be better? > > I don't recall discussing leaving one dprintk in place for > each facility. My impression is that changing rpcdebug in > this manner is what was decided during that conversation. > Just to follow up on this since I think this was an action item more appropriate for me. FYI, I have spoken to a couple Red Hat support engineers and asked them to work on: 1. man page update for rpcdebug 2. rpcdebug warning if a flag is enabled on a kernel that does not produce any output They are making good progress and I hope they will post something to the list within week or two. > > >> So far we haven't had much documentation for dprintk. That > >> means we are starting more or less from scratch for > >> explaining observability in the NFS stacks. Free rein, and > >> all that. > >> > >> -- > >> Chuck Lever > > -- > Chuck Lever > > >