When a subrequest is marked for needing retry, netfs will call cifs_prepare_write() which will make cifs repick the server for the op before renegotiating credits; it then calls cifs_issue_write() which invokes smb2_async_writev() - which re-repicks the server. If a different server is then selected, this causes the increment of server->in_flight to happen against one record and the decrement to happen against another, leading to misaccounting. Fix this by just removing the repick code in smb2_async_writev(). As this is only called from netfslib-driven code, cifs_prepare_write() should always have been called first, and so server should never be NULL and the preparatory step is repeated in the event that we do a retry. The problem manifests as a warning looking something like: WARNING: CPU: 4 PID: 72896 at fs/smb/client/smb2ops.c:97 smb2_add_credits+0x3f0/0x9e0 [cifs] ... RIP: 0010:smb2_add_credits+0x3f0/0x9e0 [cifs] ... smb2_writev_callback+0x334/0x560 [cifs] cifs_demultiplex_thread+0x77a/0x11b0 [cifs] kthread+0x187/0x1d0 ret_from_fork+0x34/0x60 ret_from_fork_asm+0x1a/0x30 Which may be triggered by a number of different xfstests running against an Azure server in multichannel mode. generic/249 seems the most repeatable, but generic/215, generic/249 and generic/308 may also show it. Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib") Reported-by: Steve French <sfrench@xxxxxxxxx> Signed-off-by: David Howells <dhowells@xxxxxxxxxx> Reviewed-by: Paulo Alcantara (Red Hat) <pc@xxxxxxxxxxxxx> cc: Jeff Layton <jlayton@xxxxxxxxxx> cc: Aurelien Aptel <aaptel@xxxxxxxx> cc: linux-cifs@xxxxxxxxxxxxxxx cc: netfs@xxxxxxxxxxxxxxx cc: linux-fsdevel@xxxxxxxxxxxxxxx --- fs/smb/client/smb2pdu.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c index 2ae2dbb6202b..bb84a89e5905 100644 --- a/fs/smb/client/smb2pdu.c +++ b/fs/smb/client/smb2pdu.c @@ -4859,9 +4859,6 @@ smb2_async_writev(struct cifs_io_subrequest *wdata) struct cifs_io_parms *io_parms = NULL; int credit_request; - if (!wdata->server || test_bit(NETFS_SREQ_RETRYING, &wdata->subreq.flags)) - server = wdata->server = cifs_pick_channel(tcon->ses); - /* * in future we may get cifs_io_parms passed in from the caller, * but for now we construct it here...