> On Jul 29, 2021, at 23:15, Matthew Robertson <mrobertson@xxxxxxxxxxxxxxx> wrote: > > Doing a simple open of a file from 2 readers and there is a strange 5s > delay on the second open. We did NOT see the issue on kernel: > CentOS 4.18.0-193.6.3.el8_2.x86_64 > but the issue is present in: > CentOS 4.18.0-305.7.1.el8_4.x86_64 Your OPEN implementation is fundamentally broken. You’re handing out an illegal stateid. The client is not obligated to take into account an implementation that does not bump the seqid as required by the NFSv4 spec. > > > #include <stdio.h> > > int main() { > > FILE * file_pointer; > > char buffer[50], c; > > file_pointer = fopen("/mnt/nfsv4/test/t.txt", "r"); > > fgets(buffer, 50, file_pointer); > > file_pointer = fopen("/mnt/nfsv4/test/t.txt", "r"); > > fgets(buffer, 50, file_pointer); > > fclose(file_pointer); > > return 0; > > } > > > > Call Graph from ftrace > > _nfs4_opendata_to_nfs4_state() > > nfs_refresh_inode() { > > nfs_refresh_inode.part.28() { > > nfs_refresh_inode_locked() { > > update_open_stateid() > > ** 5s wait ** > > Then the open / read completes > > > We do not see this issue on nfsv3 only nfsv4. > > > Here is the wireshark of the request and response: > > The second request and response happens immediately in the wireshark. > The delay is in the kernel. > > > > Frame 25: 378 bytes on wire (3024 bits), 378 bytes captured (3024 bits) > Ethernet II, Src: fa:16:3e:88:e2:24 (fa:16:3e:88:e2:24), Dst: > PureStor_5b:45:10 (24:a9:37:5b:45:10) > Internet Protocol Version 4, Src: 10.15.132.250, Dst: 10.15.128.15 > Transmission Control Protocol, Src Port: 710, Dst Port: 2049, Seq: > 2125, Ack: 1193, Len: 312 > Remote Procedure Call, Type:Call XID:0xed175b8d > Network File System, Ops(5): SEQUENCE, PUTFH, OPEN, ACCESS, GETATTR > [Program Version: 4] > [V4 Procedure: COMPOUND (1)] > Tag: <EMPTY> > minorversion: 1 > Operations (count: 5): SEQUENCE, PUTFH, OPEN, ACCESS, GETATTR > Opcode: SEQUENCE (53) > sessionid: 00c800000000000c0000000000000000 > seqid: 0x00000073 > slot id: 0 > high slot id: 0 > cache this?: Yes > Opcode: PUTFH (22) > FileHandle > Opcode: OPEN (18) > seqid: 0x00000000 > share_access: OPEN4_SHARE_ACCESS_READ (1) > share_deny: OPEN4_SHARE_DENY_NONE (0) > clientid: 0x00c8000000000017 > owner: <DATA> > Open Type: OPEN4_NOCREATE (0) > Claim Type: CLAIM_FH (4) > Opcode: ACCESS (3), [Check: RD MD XT XE] > Check access: 0x2d > .... ...1 = 0x001 READ: allowed? > .... .1.. = 0x004 MODIFY: allowed? > .... 1... = 0x008 EXTEND: allowed? > ..1. .... = 0x020 EXECUTE: allowed? > Opcode: GETATTR (9) > Attr mask[0]: 0x0010011a (Type, Change, Size, FSID, FileId) > reqd_attr: Type (1) > reqd_attr: Change (3) > reqd_attr: Size (4) > reqd_attr: FSID (8) > reco_attr: FileId (20) > Attr mask[1]: 0x0030a23a (Mode, NumLinks, Owner, > Owner_Group, RawDev, Space_Used, Time_Access, Time_Metadata, > Time_Modify) > reco_attr: Mode (33) > reco_attr: NumLinks (35) > reco_attr: Owner (36) > reco_attr: Owner_Group (37) > reco_attr: RawDev (41) > reco_attr: Space_Used (45) > reco_attr: Time_Access (47) > reco_attr: Time_Metadata (52) > reco_attr: Time_Modify (53) > [Main Opcode: OPEN (18)] > > > > > > Frame 26: 374 bytes on wire (2992 bits), 374 bytes captured (2992 bits) > Ethernet II, Src: PureStor_5b:45:10 (24:a9:37:5b:45:10), Dst: > fa:16:3e:88:e2:24 (fa:16:3e:88:e2:24) > Internet Protocol Version 4, Src: 10.15.128.15, Dst: 10.15.132.250 > Transmission Control Protocol, Src Port: 2049, Dst Port: 710, Seq: > 1193, Ack: 2437, Len: 308 > Remote Procedure Call, Type:Reply XID:0xed175b8d > Network File System, Ops(5): SEQUENCE PUTFH OPEN ACCESS GETATTR > [Program Version: 4] > [V4 Procedure: COMPOUND (1)] > Status: NFS4_OK (0) > Tag: <EMPTY> > Operations (count: 5) > Opcode: SEQUENCE (53) > Status: NFS4_OK (0) > sessionid: 00c800000000000c0000000000000000 > seqid: 0x00000073 > slot id: 0 > high slot id: 127 > target high slot id: 127 > status flags: 0x00000000 > Opcode: PUTFH (22) > Status: NFS4_OK (0) > Opcode: OPEN (18) > Status: NFS4_OK (0) > StateID > [StateID Hash: 0xaa04] > StateID seqid: 1 > StateID Other: 000000000000000000000000 > [StateID Other hash: 0x60de] > change_info > Atomic: Yes > changeid (before): 1626267251187000000 > changeid (after): 1626267251187000000 > result flags: 0x00000004, locktype posix > Delegation Type: OPEN_DELEGATE_NONE (0) > Opcode: ACCESS (3), [NOT Supported: LU DL], [Access Denied: LU > MD XT DL XE], [Allowed: RD] > Status: NFS4_OK (0) > Supported types (of requested): 0x3f > Access rights (of requested): 0x01 > .... ...1 = 0x001 READ: allowed > .... ..0. = 0x002 LOOKUP: *Access Denied* > .... .0.. = 0x004 MODIFY: *Access Denied* > .... 0... = 0x008 EXTEND: *Access Denied* > ...0 .... = 0x010 DELETE: *Access Denied* > ..0. .... = 0x020 EXECUTE: *Access Denied* > Opcode: GETATTR (9) > Status: NFS4_OK (0) > Attr mask[0]: 0x0010011a (Type, Change, Size, FSID, FileId) > reqd_attr: Type (1) > ftype4: NF4REG (1) > reqd_attr: Change (3) > changeid: 1626267251187000000 > reqd_attr: Size (4) > size: 7 > reqd_attr: FSID (8) > fattr4_fsid > reco_attr: FileId (20) > fileid: 20829148276664444 > Attr mask[1]: 0x0030a23a (Mode, NumLinks, Owner, > Owner_Group, RawDev, Space_Used, Time_Access, Time_Metadata, > Time_Modify) > reco_attr: Mode (33) > mode: 0644, Name: Unknown, Read permission for > owner, Write permission for owner, Read permission for group, Read > permission for others > reco_attr: NumLinks (35) > numlinks: 1 > reco_attr: Owner (36) > fattr4_owner: 0 > reco_attr: Owner_Group (37) > fattr4_owner_group: 0 > reco_attr: RawDev (41) > specdata1: 0 > specdata2: 0 > reco_attr: Space_Used (45) > space_used: 7 > reco_attr: Time_Access (47) > seconds: 1626267228 > nseconds: 0 > reco_attr: Time_Metadata (52) > seconds: 1626267251 > nseconds: 187000000 > reco_attr: Time_Modify (53) > seconds: 1626267251 > nseconds: 186841000 > [Main Opcode: OPEN (18)] > > > -- > Matt Robertson // Data Architect | Pure Storage, Inc. > 913-915-1162 | mrobertson@xxxxxxxxxxxxxxx _________________________________ Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx