I've found this extremely useful on clients in tracking down 'lost'
delegations:

  echo "error != 0" | tee /sys/kernel/debug/tracing/events/nfs4/nfs4_delegreturn_exit/filter

...and then look in here:

  cat /sys/kernel/debug/tracing/trace

(YMMV; I'm not sure whether this is going to work on your distro,
debugfs, etc.)

There's still work to be done with nfsd4_delegreturn() and revoked
delegations server-side (as well as killing fh_verify(), per Bruce's
earlier suggestions).

We've recently seen the server recall a delegation, revoke it, and then
have the client try to return it much later (because of an unknown
slowness issue) -- after the file had been deleted at the server.

Jason L Tibbitts III <tibbs@xxxxxxxxxxx> writes:

>>>>>> "JBF" == J Bruce Fields <bfields@xxxxxxxxxxxx> writes:
>
> JBF> So, you're using NFSv4.1 or 4.2, and the server thinks that the
> JBF> client has reused a (slot, sequence number) pair, but the server
> JBF> doesn't have a cached response to return.
>
> Thanks for the reply.  Sadly I don't understand all of it, but...
>
> JBF> Hard to know how that happened, and it's not shown in the below.
> JBF> Sounds like a bug, though.
>
> Yeah, I only found the problem after it was already happening, so
> obviously the beginning of the process is missing.  And sadly it's not
> something I can easily repeat, so short of running some continuous
> packet capture (which would be hard since once this starts the traffic
> volume is huge) there's no easy way to see it.
>
> Is there any state on either the client or server that I could inspect
> which might give any hints?  I can add that to my notes in case this
> problem happens again.
>
> JBF> Recent clients will use sec=krb5 for certain state-related
> JBF> operations even if you mount with sec=sys, so it's still possible
> JBF> it could be involved here.
>
> On the server, the involved filesystem isn't exported with any sec=
> options, in case it matters.
>
> JBF> The SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag set in the OPEN
> JBF> replies is also a sign something's gone wrong.  Apparently the
> JBF> server thinks the client has failed to return a delegation.
>
> I can't imagine how that might have happened.  There is nothing else
> NFS-related in the client's log besides the spew and that final line.
> There are some automount complaints about the user accessing directories
> that aren't in the map sources, and the usual random gssproxy noise
> which was fixed in Fedora 24.
>
> Currently the system is stable; it hasn't been rebooted since the
> problem occurred.  Everything cleared up once I was able to unmount
> the problematic filesystem.
>
>  - J<
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

--
Andrew W. Elble
aweits@xxxxxxxxxxxxxxxxxx
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95  DA2C B0EB 965B 082E 863E C912
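
For reference, the tracepoint steps at the top of this message could be wrapped up roughly as below. This is only a sketch: the function name and its optional directory argument are made up for illustration, and it additionally writes 1 to the event's "enable" file on the assumption that the event is not already enabled (a filter by itself does not turn an event on). As noted above, whether the debugfs path exists depends on the distro.

```shell
#!/bin/sh
# Sketch only: restrict the nfs4_delegreturn_exit tracepoint to failed
# returns and enable it. trace_failed_delegreturns and its optional
# argument are hypothetical names, not part of any existing tool.

trace_failed_delegreturns() {
    # Default location from the message above; may differ per distro.
    tracing_dir="${1:-/sys/kernel/debug/tracing}"
    event_dir="$tracing_dir/events/nfs4/nfs4_delegreturn_exit"

    # Bail out early if the tracing tree or the nfs4 event is not there.
    if [ ! -d "$event_dir" ]; then
        echo "no such event directory: $event_dir" >&2
        return 1
    fi

    # Keep only DELEGRETURN completions that carried an error.
    echo "error != 0" > "$event_dir/filter" || return 1

    # Assumption: the event is not enabled yet, so switch it on.
    echo 1 > "$event_dir/enable" || return 1

    # Print where the resulting records can be read.
    echo "$tracing_dir/trace"
}
```

Run as root on the client, something like cat "$(trace_failed_delegreturns)" would then show only the failed delegation returns.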