On Sun, 18 Mar 2012 22:23:47 +0400 Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote: > 18 марта 2012 г. 14:50 пользователь Jeff Layton <jlayton@xxxxxxxxxx> написал: > > On Sat, 17 Mar 2012 10:20:59 -0500 > > Steve French <smfrench@xxxxxxxxx> wrote: > > > >> On Sat, Mar 17, 2012 at 9:53 AM, Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote: > >> > 17 марта 2012 г. 15:12 пользователь Jeff Layton <jlayton@xxxxxxxxxx> написал: > >> >> On Fri, 16 Mar 2012 18:09:24 +0300 > >> >> Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote: > >> >> > >> >>> Some servers sets this value less than 50 that was hardcoded and > >> >>> we lost the connection if when we exceed this limit. Fix this by > >> >>> respecting this value - not sending more than the server allows. > >> >>> > >> >>> Signed-off-by: Pavel Shilovsky <piastry@xxxxxxxxxxx> > >> >>> --- > >> >>> fs/cifs/cifsfs.c | 8 ++++---- > >> >>> fs/cifs/cifsglob.h | 10 +++------- > >> >>> fs/cifs/cifssmb.c | 9 +++++++-- > >> >>> fs/cifs/connect.c | 11 ++++------- > >> >>> fs/cifs/dir.c | 6 ++++-- > >> >>> fs/cifs/file.c | 4 ++-- > >> >>> fs/cifs/transport.c | 4 ++-- > >> >> > >> >> Introducing a behavior change like this at the beginning of the series > >> >> is probably a mistake. You'll have no reasonable way to bisect down > >> >> regressions, so you won't know if a problem is due to the change to a > >> >> credit-based model or due to changing the client to respect the maxmpx. > >> >> > >> >> Instead of doing this, we should instead reorganize the code around a > >> >> credit based model first, while attempting to mimic the existing > >> >> behavior as closely as possible. Once that framework is in place, then > >> >> change the behavior and start respecting the maxmpx. > >> >> > >> >> If you do that, then someone can reasonably bisect between those two > >> >> changes and we can determine the source of a regression. > >> >> -- > >> >> Jeff Layton <jlayton@xxxxxxxxxx> > >> > > >> > This was done to push this patch to 3.3 and stable tree as well. > >> > That's why it is in the start of series. > >> > > >> > If we decide not to push it for now, I agree that this patch should be > >> > in the modified version (according to changes from other patches) at > >> > the end of the series. > >> > > >> > Steve, your thoughts? > >> > > >> > -- > >> > Best regards, > >> > Pavel Shilovsky. > >> > >> We want to fix the reported (cifs) problem first with a simple patch > >> (that would be suitable for stable too). > >> > >> The most important reported problems are mounts to servers > >> running some versions of Windows Vista and some versions of Windows 7 > >> which set maxmpx below 50 and more importantly have maxworkitems > >> (in the windows registry on the server) set at too small a value to handle > >> the 20 or 30 async reads or writes which are now often in flight to them. > >> > >> The fix we need first is to honor the maxmpx that the server sets (and > >> probably remove cifs_max_pending which is obsolete as > >> a global module parm or simply do the minimum of cifs_max_pending > >> and maxmpx as the earlier patch did). > >> If desired (to workaround buggy > >> servers) in a second patch we could add a mount parm if we think there are > >> servers such as windows 7 for which we will need to set the > >> negotiated value lower (due to their maxworkitems registry > >> problem), or Samba for which maxmpx could be ignored > >> (and as JRA and others indicated - we could send > >> a thousand requests in parallel). > >> > > > > By the way... > > > > What's the plan for blocking locks and echoes here? Do you plan to make > > them just ignore the maxmpx limit? > > I think we shouldn't play with them for stable. We still don't know > what exactly we should do with them - really risky to make experiments > on stable kernels. > Then I'm confused, since you said this in your earlier email: > This was done to push this patch to 3.3 and stable tree as well. > That's why it is in the start of series. > If you're not worrying about stable (and I agree that it's premature to do that here), then you should focus on converting everything to use the new credit-based scheme first and only then introduce the behavior change of respecting maxmpx. > So, as we are sure that exceeding negotiated maxmpx value with any > request except blocking lock and echo is wrong, I suggest to fix this > - it is what this patch (#1) does. When we get more information about > blocking locks and echos we can fix this too. > Well, according to what MS wrote back to Steve when he asked, exceeding the maxmpx is wrong _period_, regardless of the call. Echoes are fairly easy to deal with, but blocking locks are another matter entirely. I believe we need a fundamental rethink there. I think we need to change blocking lock calls to respect the maxmpx, use a small timeout (3-5s or so) and call it in a loop. That would allow other calls to "slip in", though it will suck for lock "fairness". That's a separate problem though. For the initial pass here, I think we should just continue to ignore the limit for blocking locks. -- Jeff Layton <jlayton@xxxxxxxxxx> -- To unsubscribe from this list: send the line "unsubscribe linux-cifs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html