On 22 Feb 2023, at 7:22, Jeff Layton wrote:
> On Wed, 2023-02-22 at 06:54 -0500, Benjamin Coddington wrote:
>> On 22 Feb 2023, at 3:19, Florian Möller wrote:
>>
>>> On 21.02.23 at 19:58, Benjamin Coddington wrote:
>>>> On 21 Feb 2023, at 11:52, Florian Möller wrote:
>>>>
>>>>> Hi Benjamin,
>>>>>
>>>>> Here are the trace and a listing of the corresponding network
>>>>> packets. If the listing is not detailed enough, I can send you a
>>>>> full packet dump tomorrow.
>>>>>
>>>>> The command I used was
>>>>>
>>>>> touch test.txt && sleep 2 && touch test.txt
>>>>>
>>>>> test.txt did not exist previously, so you have an example of a
>>>>> touch both without and with the delay.
>>>>
>>>> Thanks! These are great - I can see from them that the client is
>>>> indeed waiting in the stateid update mechanism because the server
>>>> has returned NFS4ERR_STALE to the client's first CLOSE.
>>>>
>>>> That is unusual. The server is signaling that the open file's
>>>> stateid is old, so I am interested to see if the first CLOSE is
>>>> sent with the stateid's sequence that was returned from the server.
>>>> I could probably see this if I had the server-side tracepoint data.
>>>
>>> Hi Benjamin,
>>>
>>> The server-side tracepoints
>>>
>>> nfsd:nfsd_preprocess
>>> sunrpc:svc_process
>>>
>>> were enabled. It seems they did not produce any output.
>>>
>>> What I did now was:
>>> - enable all nfsd tracepoints,
>>> - enable all nfs4 tracepoints,
>>> - enable all sunrpc tracepoints.
>>>
>>> The command I used was
>>>
>>> touch somefile && sleep 2 && touch somefile
>>>
>>> Then I unmounted the NFS share - this also causes a delay.
>>>
>>> I changed the security type to krb5 and captured trace and network
>>> output for a version 4.0 and a version 4.2 mount. The delay does not
>>> occur when using version 4.0.
>>
>> In frame 9 of nfs-v4.2-krb5.pcap, the server responds to PUTFH with
>> NFS4ERR_STALE, so nothing to do with the open stateid sequencing.  I
>> also see:
>>
>> nfsd-1693 [000] ..... 1951.353889: nfsd_exp_find_key: fsid=1::{0x0,0xe5fcf0,0xffffc900,0x811e87a3,0xffffffff,0xe5fd00} domain=gss/krb5i status=-2
>> nfsd-1693 [000] ..... 1951.353889: nfsd_set_fh_dentry_badexport: xid=0xe1511810 fh_hash=0x3f9e713a status=-2
>> nfsd-1693 [000] ..... 1951.353890: nfsd_compound_status: op=2/4 OP_PUTFH status=70
>>
>
> This just means that the kernel called into the "cache" infrastructure
> to find an export entry, and there wasn't one.
>
> Looking back at the original email here, I'd say this is expected since
> the export wasn't set up for krb5i.
>
> Output of exportfs -v:
> /export
>     gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> /export
>     gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)

Jeff and I were chatting on #linux-nfs, and it's back to looking like a
client issue: as he pointed out, the client starts out with krb5, then in
frame 8 (the CLOSE that receives STALE) switches to krb5i, then back to
krb5 in frame 10.

This should be reproducible.  Let me dust off my old KDC.

Ben
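
For anyone following along, enabling those tracepoint groups can be done
through tracefs. A minimal sketch, assuming tracefs is mounted at
/sys/kernel/tracing and the commands are run as root (the output file
name is just a placeholder):

    # older kernels may have tracefs at /sys/kernel/debug/tracing instead
    echo 1 > /sys/kernel/tracing/events/nfsd/enable     # all nfsd events (server side)
    echo 1 > /sys/kernel/tracing/events/sunrpc/enable   # all sunrpc events
    echo 1 > /sys/kernel/tracing/events/nfs4/enable     # all nfs4 events (client side)
    cat /sys/kernel/tracing/trace_pipe > nfs-trace.txt  # stream events to a file

If trace-cmd is available, "trace-cmd record -e nfsd -e sunrpc" captures
the same server-side events with less typing.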
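
On the export side: the exportfs -v output above shows only the legacy
gss/krb5 and gss/krb5p pseudo-client entries, both carrying sec=sys and
with no krb5i entry, which matches Jeff's reading. For comparison only, a
hypothetical /etc/exports line using the sec= option to list the Kerberos
flavors explicitly (the wildcard client and option set here are
illustrative, not the reporter's configuration) might look like:

    # hypothetical example: export to any client, offering all three krb5 flavors
    /export  *(rw,sec=krb5:krb5i:krb5p,crossmnt,no_subtree_check,fsid=0)

followed by "exportfs -ra" to reload the export table.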