I ran a git bisect and the performance drop came with commit
d3de48233978524514d3b605ad55bb21d1ecd706

----- Original Message -----
From: "ronnie sahlberg" <ronniesahlberg@xxxxxxxxx>
To: "Steve French" <smfrench@xxxxxxxxx>
Cc: "Pavel Shilovsky" <piastryyy@xxxxxxxxx>, "Aurélien Aptel" <aaptel@xxxxxxxx>, "CIFS" <linux-cifs@xxxxxxxxxxxxxxx>
Sent: Thursday, 21 February, 2019 8:47:44 AM
Subject: Re: xfstests and current cifs for-next patch set

generic/013 and generic/014 failed with timeout.
These tests used to take several minutes.

Testing locally, generic/013 is successful but it takes almost three
times longer than it used to just a week ago.

On Thu, Feb 21, 2019 at 7:18 AM Steve French <smfrench@xxxxxxxxx> wrote:
>
> Looks like Pavel's latest fix (unrelated to credits it turns out, the
> problem in this case was skipping a mid) does fix xfstest 310. Azure
> test bucket passes, no reconnects that I spotted:
>
> http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/4/builds/94
>
> Running cifs-testing buildbot bucket now.
>
> http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/2/builds/134
>
> On Sun, Feb 17, 2019 at 1:30 PM Steve French <smfrench@xxxxxxxxx> wrote:
> >
> > Retrying the same test run it worked. Rerunning the same set of
> > patches but this time with larger (cifs-testing) collection of tests
> > on the buildbot
> >
> > On Sat, Feb 16, 2019 at 10:38 PM Steve French <smfrench@xxxxxxxxx> wrote:
> > >
> > > The test (310 and subsequent) seemed to start failing with this in dmesg:
> > >
> > > [root@fedora29 ~]# dmesg
> > > [ 2969.016552] CIFS VFS: Cancelling wait for mid 29640 cmd: 14
> > > [ 2979.449426] CIFS VFS: disabling echoes and oplocks
> > > [ 2999.109655] CIFS VFS: Cancelling wait for mid 1494 cmd: 6
> > > [ 3225.207488] CIFS VFS: Server linuxsmb3testshares.file.core.windows.net has not responded in 120 seconds. Reconnecting...
> > >
> > > On Sat, Feb 16, 2019 at 8:30 PM Steve French <smfrench@xxxxxxxxx> wrote:
> > > >
> > > > So (unless there is a random factor involved) - I narrowed it down to this patch
> > > >
> > > > Author: Pavel Shilovsky <pshilov@xxxxxxxxxxxxx>
> > > > Date: Wed Jan 16 11:12:41 2019 -0800
> > > >
> > > > CIFS: Respect reconnect in MTU credits calculations
> > > >
> > > > Every time after a session reconnect we don't need to account for
> > > > credits obtained in previous sessions. Introduce new struct cifs_credits
> > > > which contains both credits value and reconnect instance of the
> > > > time those credits were taken. Modify a routine that add credits
> > > > back to handle the reconnect instance by assuming zero credits
> > > > if the reconnect happened after the credits were obtained and
> > > > before we decided to add them back due to some errors during sending.
> > > >
> > > > This patch fixes the MTU credits cases. The subsequent patch
> > > > will handle non-MTU ones.
> > > >
> > > > Signed-off-by: Pavel Shilovsky <pshilov@xxxxxxxxxxxxx>
> > > > Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx>
> > > >
> > > >
> > > > ---------- Forwarded message ---------
> > > > From: Steve French <smfrench@xxxxxxxxx>
> > > > Date: Sat, Feb 16, 2019 at 6:46 PM
> > > > Subject: Re: xfstests and current cifs for-next patch set
> > > > To: CIFS <linux-cifs@xxxxxxxxxxxxxxx>
> > > >
> > > >
> > > > Narrowed the xfstest 310 possible regression in current for-next down
> > > > to three patches, rerunning with this one of the three added (see
> > > > http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/4/builds/85)
> > > >
> > > > Author: Pavel Shilovsky <pshilov@xxxxxxxxxxxxx>
> > > > Date: Wed Jan 16 11:12:41 2019 -0800
> > > >
> > > > CIFS: Respect reconnect in MTU credits calculations
> > > >
> > > > On Sat, Feb 16, 2019 at 1:40 PM Steve French <smfrench@xxxxxxxxx> wrote:
> > > > >
> > > > > With 5.0-rc5 and current for-next (29 patches) two tests 310 (read and
> > > > > readdir simultaneously) and 422 (delayed allocation stat, number of
> > > > > blocks) fail (I see this in the azure test bucket in the buildbot).
> > > > > See this run: http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/4/builds/80
> > > > >
> > > > > These don't fail when I select only the first 8 cifs fixes in for-next
> > > > > on top of 5.0-rc5. See
> > > > > http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/4/builds/82
> > > > > so am trying to narrow it down. This run (in progress)
> > > > > http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/4/builds/83
> > > > > has the first 19 (of the 29) cifs patches (on top of 5.0-rc5 mainline
> > > > > as with the runs above) so we can bisect which commit causes the
> > > > > problem with tests 310 and 422.
> > > > >
> > > > > This seems unrelated to the problem I see in slightly more current
> > > > > mainline (that we can see with no cifs changes) in xfstest 422 that
> > > > > was introduced with 5.0-rc6.
> > > > >
> > > > > Let me know if others (or other scenario problems) see the tests
> > > > > 310/422 failure.
> > > > >
> > > > > --
> > > > > Thanks,
> > > > >
> > > > > Steve
> > > >
> > > >
> > > > --
> > > > Thanks,
> > > >
> > > > Steve
> > > >
> > > >
> > > > --
> > > > Thanks,
> > > >
> > > > Steve
> > >
> > >
> > > --
> > > Thanks,
> > >
> > > Steve
> >
> >
> > --
> > Thanks,
> >
> > Steve
>
>
> --
> Thanks,
>
> Steve
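
For anyone following the thread without the patch open, the idea described in the quoted commit message ("CIFS: Respect reconnect in MTU credits calculations") can be modelled roughly as below. This is only a toy sketch in Python, not the kernel code: the class names Credits and Server, the methods take/reconnect/add_back, and the credit counts are all made up for illustration. The only parts taken from the commit message are that credits are stored together with the reconnect instance at which they were taken, and that they count as zero when handed back after a reconnect.

```python
from dataclasses import dataclass


@dataclass
class Credits:
    """Hypothetical stand-in for the struct cifs_credits named in the commit message."""
    value: int     # number of credits taken
    instance: int  # reconnect instance at the time they were taken


class Server:
    """Toy model of a server connection; not the kernel's data structure."""

    def __init__(self, credits: int) -> None:
        self.credits = credits
        self.reconnect_instance = 0

    def take(self, wanted: int) -> Credits:
        # Hand out up to `wanted` credits, stamped with the current instance.
        granted = min(wanted, self.credits)
        self.credits -= granted
        return Credits(granted, self.reconnect_instance)

    def reconnect(self, fresh_credits: int) -> None:
        # A reconnect starts a new session: new instance, fresh credit count.
        self.reconnect_instance += 1
        self.credits = fresh_credits

    def add_back(self, c: Credits) -> int:
        # Credits taken in an earlier session are treated as zero.
        returned = c.value if c.instance == self.reconnect_instance else 0
        self.credits += returned
        c.value = 0  # the caller no longer holds these credits
        return returned
```

In this model, credits taken before a reconnect and returned after it (for example because sending failed) do not inflate the new session's count, while credits taken and returned within the same session are restored as usual.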