Hi Steve and David,

During internal testing of CIFS over RDMA, the Linux client gets reset and reports a `Can not open temporary file for read` error. On further debugging, we found that the QP transitions from the RTS state to the ERROR state due to a peer_abort, and a kernel bisect points to commit d08089f649a0cfb2099c8551ac47eef0cc23fdf2 ("cifs: Change the I/O paths to use an iterator rather than a page list").

Since this commit is a significant change to CIFS, I need some help understanding how it affects RDMA.

Note: the issue is seen with both iWARP and RoCE.

-----------------------------------------------------------------------------------------------------------
Here is the test case:

Server - Windows [machine info: Windows Server 2022, drivers: Chelsio 6.16.20.0, Mellanox 2.42.22627.0]
Client - Linux [kernel v6.11]

On Windows:
-> Two CIFS shares were created in Windows as below (sh1 and sh2)

PS C:\Windows\system32> Get-SmbShare

Name ScopeName Path      Description
---- --------- ----      -----------
sh1  *         C:\share1
sh2  *         C:\share2

Mount on Linux:
#mount -o rdma,username=<username>,password=<password> //102.1.1.222/share1 /mount1

-----------------------------------------------------------------------------------------------------------
iozone test error:

[root@core mount1]# iozone -a -+d -I
2048 512 257447 266769 269650 266108 267467 262771 264120 265935 251500 3169944 3229534 2998435 3573472
2048 1024 294591 302869 311254 313434 314248 292156
Can not open temporary file for read
open: Resource temporarily unavailable
[root@core mount1]#

dmesg from the Linux machine:

[ 886.136758] CIFS: No dialect specified on mount. Default has changed to a more secure dialect, SMB2.1 or later (e.g. SMB3.1.1), from CIFS (SMB1). To use the less secure SMB1 dialect to access old servers which do not support SMB3.1.1 (or even SMB3 or SMB2.1) specify vers=1.0 on mount.
[ 886.136767] CIFS: Attempting to mount //102.50.50.67/sh2
[ 886.136828] CIFS: smbd_conn_upcall: event->event 0
[ 886.136886] CIFS: smbd_conn_upcall: event->event 2
[ 886.142392] ib_core: ib_modify_mad: IB_WC_WR_FLUSH_ERR line 2480
[ 886.143248] CIFS: smbd_conn_upcall: event->event 8
[ 886.143314] CIFS: VFS: _smbd_get_connection:1634 rdma_connect failed port=5445
[ 886.146682] CIFS: smbd_conn_upcall: event->event 0
[ 886.146703] CIFS: smbd_conn_upcall: event->event 2
[ 886.151900] ib_core: ib_modify_mad: IB_WC_WR_FLUSH_ERR line 2480
[ 886.153598] CIFS: smbd_conn_upcall: event->event 9
[ 886.167331] CIFS: VFS: RDMA transport established
[ 886.211612] CIFS: VFS: generate_smb3signingkey: dumping generated AES session keys
[ 886.211619] CIFS: VFS: Session Id 05 00 00 00 00 e8 00 00
[ 886.211622] CIFS: VFS: Cipher type 2
[ 886.211624] CIFS: VFS: Session Key 13 54 03 6e 28 5f 85 8e 22 d1 37 31 c8 8f 3c 96
[ 886.211626] CIFS: VFS: Signing Key 31 8a d7 79 2d 66 fa 98 d5 85 05 9d 58 df 96 88
[ 886.211628] CIFS: VFS: ServerIn Key 45 10 c5 17 fa 30 fc 17 a2 17 5d 6c 20 cd 56 ac
[ 886.211630] CIFS: VFS: ServerOut Key b1 ae 49 12 d6 5e c9 b0 16 d9 7d b5 8f 81 4f 30
[ 915.835418] CIFS: smbd_conn_upcall: event->event 10
[ 915.835434] CIFS: smbd_conn_upcall: info->transport_status 2
[ 915.835496] CIFS: smbd_recv_buf: info->transport_status 5
[ 915.835504] CIFS: VFS: smbd_recv_buf:1865 disconnected
[ 915.835515] CIFS: smbd_destroy: info->transport_status 5
[ 915.836349] CIFS: smbd_disconnect_rdma_work: info->transport_status = SMBD_DISCONNECTING
[ 915.836563] CIFS: smbd_disconnect_rdma_work: info->transport_status = SMBD_DISCONNECTING
[ 915.836618] CIFS: smbd_disconnect_rdma_work: info->transport_status = SMBD_DISCONNECTING
[ 915.836754] CIFS: smbd_disconnect_rdma_work: info->transport_status = SMBD_DISCONNECTING
[ 915.836796] CIFS: smbd_disconnect_rdma_work: info->transport_status = SMBD_DISCONNECTING
[ 915.849813] CIFS: smbd_conn_upcall: event->event 0
[ 915.849858] CIFS: smbd_conn_upcall: event->event 2
[ 915.852292] ib_core: ib_modify_mad: IB_WC_WR_FLUSH_ERR line 2480
[ 915.852570] CIFS: smbd_conn_upcall: event->event 8
[ 915.852603] CIFS: VFS: _smbd_get_connection:1634 rdma_connect failed port=5445
[ 915.854699] CIFS: smbd_conn_upcall: event->event 0
[ 915.854725] CIFS: smbd_conn_upcall: event->event 2
[ 915.858993] ib_core: ib_modify_mad: IB_WC_WR_FLUSH_ERR line 2480
[ 915.859898] CIFS: smbd_conn_upcall: event->event 9
[ 915.872536] CIFS: VFS: RDMA transport re-established
[ 915.874417] CIFS: VFS: generate_smb3signingkey: dumping generated AES session keys
[ 915.874424] CIFS: VFS: Session Id 09 00 00 00 00 e8 00 00
[ 915.874427] CIFS: VFS: Cipher type 2
[ 915.874430] CIFS: VFS: Session Key 74 a5 36 e6 dd db 13 16 2b 84 1f a5 6f b2 85 1d
[ 915.874433] CIFS: VFS: Signing Key 35 b4 96 d0 60 a5 74 2d ff c5 a6 ed c0 34 ab 1c
[ 915.874436] CIFS: VFS: ServerIn Key 2b 7b d4 a0 67 d2 64 c3 20 df 6b 15 17 ae 00 b0
[ 915.874439] CIFS: VFS: ServerOut Key d6 a2 53 c2 a7 19 9f 40 81 34 2c ae 91 84 ba d9
[ 916.091622] CIFS: smbd_conn_upcall: event->event 10
[ 916.091631] CIFS: smbd_conn_upcall: info->transport_status 2
[ 916.091702] CIFS: smbd_recv_buf: info->transport_status 5
[ 916.091711] CIFS: VFS: smbd_recv_buf:1865 disconnected
[ 916.091721] CIFS: smbd_destroy: info->transport_status 5
[ 916.092228] CIFS: smbd_disconnect_rdma_work: info->transport_status = SMBD_DISCONNECTING
[ 916.092334] CIFS: smbd_disconnect_rdma_work: info->transport_status = SMBD_DISCONNECTING
[ 916.092421] CIFS: smbd_disconnect_rdma_work: info->transport_status = SMBD_DISCONNECTING
[ 916.092502] CIFS: smbd_disconnect_rdma_work: info->transport_status = SMBD_DISCONNECTING
[ 916.092620] CIFS: smbd_disconnect_rdma_work: info->transport_status = SMBD_DISCONNECTING
[ 916.102309] CIFS: smbd_conn_upcall: event->event 0
[ 916.102335] CIFS: smbd_conn_upcall: event->event 2
[ 916.103772] ib_core: ib_modify_mad: IB_WC_WR_FLUSH_ERR line 2480
[ 916.103960] CIFS: smbd_conn_upcall: event->event 8
[ 916.103973] CIFS: VFS: _smbd_get_connection:1634 rdma_connect failed port=5445
[ 916.105455] CIFS: smbd_conn_upcall: event->event 0
[ 916.105472] CIFS: smbd_conn_upcall: event->event 2
[ 916.108732] ib_core: ib_modify_mad: IB_WC_WR_FLUSH_ERR line 2480
[ 916.109355] CIFS: smbd_conn_upcall: event->event 9
[ 916.119471] CIFS: VFS: RDMA transport re-established
[ 916.120926] CIFS: VFS: generate_smb3signingkey: dumping generated AES session keys
[ 916.120934] CIFS: VFS: Session Id 0d 00 00 00 00 e8 00 00
[ 916.120938] CIFS: VFS: Cipher type 2
[ 916.120941] CIFS: VFS: Session Key 61 0c 9b e7 1c e2 bb 43 fe 18 23 03 45 ca f8 2d
[ 916.120944] CIFS: VFS: Signing Key 09 b6 b2 f3 19 ab e1 65 d7 9f d4 31 8c 1f a9 64
[ 916.120946] CIFS: VFS: ServerIn Key 75 b9 a0 cc d0 7a 25 ef 71 84 44 e8 90 40 a1 cc
[ 916.120949] CIFS: VFS: ServerOut Key 3a bd 5e fd 83 95 1e 12 58 d0 2b 1f 81 d9 22 97

Bisect logs:

[root@beag linux]# git bisect start
HEAD is now at 457391b03803 Linux 6.3
[root@beag linux]# git bisect bad 457391b0380335d5e9a5babdec90ac53928b23b4
[root@beag linux]# git bisect good 4ec5183ec48656cec489c49f989c508b68b518e3
Bisecting: 7399 revisions left to test after this (roughly 13 steps)
[a5c95ca18a98d742d0a4a04063c32556b5b66378] Merge tag 'drm-next-2023-02-23' of git://anongit.freedesktop.org/drm/drm
[root@beag linux]# git bisect bad
Bisecting: 5053 revisions left to test after this (roughly 12 steps)
[36289a03bcd3aabdf66de75cb6d1b4ee15726438] Merge tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
[root@beag linux]# git bisect good
Bisecting: 2521 revisions left to test after this (roughly 11 steps)
[0175ec3a28c695562a08fdccf73f2ec5ed744e2f] Merge tag 'regulator-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
[root@beag linux]# git bisect good
Bisecting: 1260 revisions left to test after this (roughly 10 steps)
[60b07cf5d3462ec0183d463b43619e98bc63c951] drm/amd/display: Make variables declaration inside ifdef guard
[root@beag linux]# git bisect good
Bisecting: 647 revisions left to test after this (roughly 9 steps)
[6861eaf79155f0a5544ff989754159f806795c31] Merge tag 'ata-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
[root@beag linux]# git bisect good
Bisecting: 306 revisions left to test after this (roughly 8 steps)
[9fc2f99030b55027d84723b0dcbbe9f7e21b9c6c] Merge tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
[root@beag linux]# git bisect good
Bisecting: 163 revisions left to test after this (roughly 7 steps)
[535cd7104b4efacab3bf7e56b8ad263e1160a47f] Merge tag 'drm-msm-next-2023-01-30' of https://gitlab.freedesktop.org/drm/msm into drm-next
[root@beag linux]# git bisect good
Bisecting: 78 revisions left to test after this (roughly 6 steps)
[5582f3c1b14e9b6eb02983acac84a4da71b38ca9] Merge tag 'drm-intel-next-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
[root@beag linux]# git bisect good
Bisecting: 39 revisions left to test after this (roughly 5 steps)
[d08089f649a0cfb2099c8551ac47eef0cc23fdf2] cifs: Change the I/O paths to use an iterator rather than a page list
[root@beag linux]# git bisect bad
Bisecting: 19 revisions left to test after this (roughly 4 steps)
[35235e19b393b54db0e0d7c424d658ba45f20468] cifs: Replace remaining 1-element arrays
[root@beag linux]# git bisect good
Bisecting: 9 revisions left to test after this (roughly 3 steps)
[f62e52d1276b6cd329fe72d36bdf912b2ce4caaf] iov_iter: Define flags to qualify page extraction.
[root@beag linux]# git bisect good
Bisecting: 4 revisions left to test after this (roughly 2 steps)
[0185846975339a5c348373aa450a977f5242366b] netfs: Add a function to extract an iterator into a scatterlist
[root@beag linux]# git bisect good
Bisecting: 2 revisions left to test after this (roughly 1 step)
[39bc58203f040ebafbec48198a83c246b25eba99] cifs: Add a function to Hash the contents of an iterator
[root@beag linux]# git bisect good
Bisecting: 0 revisions left to test after this (roughly 1 step)
[16541195c6d9bcad568b7c6afbf855ddc3a856aa] cifs: Add a function to read into an iter from a socket
[root@beag linux]# git bisect good
d08089f649a0cfb2099c8551ac47eef0cc23fdf2 is the first bad commit
commit d08089f649a0cfb2099c8551ac47eef0cc23fdf2
Author: David Howells <dhowells@xxxxxxxxxx>
Date:   Mon Jan 24 21:13:24 2022 +0000

    cifs: Change the I/O paths to use an iterator rather than a page list

    Currently, the cifs I/O paths hand lists of pages from the VM interface
    routines at the top all the way through the intervening layers to the
    socket interface at the bottom.

    This is a problem, however, for interfacing with netfslib which passes an
    iterator through to the ->issue_read() method (and will pass an iterator
    through to the ->issue_write() method in future). Netfslib takes over
    bounce buffering for direct I/O, async I/O and encrypted content, so cifs
    doesn't need to do that. Netfslib also converts IOVEC-type iterators into
    BVEC-type iterators if necessary.

    Further, cifs needs foliating - and folios may come in a variety of sizes,
    so a page list pointing to an array of heterogeneous pages may cause
    problems in places such as where crypto is done.

    Change the cifs I/O paths to hand iov_iter iterators all the way through
    instead.

    Notes:

     (1) Some old routines are #if'd out to be removed in a follow up patch so
         as to avoid confusing diff, thereby making the diff output easier to
         follow. I've removed functions that don't overlap with anything
         added.

     (2) struct smb_rqst loses rq_pages, rq_offset, rq_npages, rq_pagesz and
         rq_tailsz which describe the pages forming the buffer; instead there's
         an rq_iter describing the source buffer and an rq_buffer which is used
         to hold the buffer for encryption.

     (3) struct cifs_readdata and cifs_writedata are similarly modified to
         smb_rqst. The ->read_into_pages() and ->copy_into_pages() are then
         replaced with passing the iterator directly to the socket.

         The iterators are stored in these structs so that they are persistent
         and don't get deallocated when the function returns (unlike if they
         were stack variables).

     (4) Buffered writeback is overhauled, borrowing the code from the afs
         filesystem to gather up contiguous runs of folios. The XARRAY-type
         iterator is then used to refer directly to the pagecache and can be
         passed to the socket to transmit data directly from there.

         This includes:

            cifs_extend_writeback()
            cifs_write_back_from_locked_folio()
            cifs_writepages_region()
            cifs_writepages()

     (5) Pages are converted to folios.

     (6) Direct I/O uses netfs_extract_user_iter() to create a BVEC-type
         iterator from an IOBUF/UBUF-type source iterator.

     (7) smb2_get_aead_req() uses netfs_extract_iter_to_sg() to extract page
         fragments from the iterator into the scatterlists that the crypto
         layer prefers.

     (8) smb2_init_transform_rq() attached pages to smb_rqst::rq_buffer, an
         xarray, to use as a bounce buffer for encryption. An XARRAY-type
         iterator can then be used to pass the bounce buffer to lower layers.

    Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
    Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx>

Thanks,
Showrya