Re: [PATCH v2 01/11] CIFS: Respect negotiated MaxMpxCount

Steve French <smfrench@xxxxxxxxx> · Mon, 19 Mar 2012 14:32:17 -0500

On Mon, Mar 19, 2012 at 2:04 PM, Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
> 19 марта 2012 г. 19:04 пользователь Jeff Layton <jlayton@xxxxxxxxxx> написал:
>> On Sun, 18 Mar 2012 22:23:47 +0400
>> Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
>>
>>> 18 марта 2012 г. 14:50 пользователь Jeff Layton <jlayton@xxxxxxxxxx> написал:
>>> > On Sat, 17 Mar 2012 10:20:59 -0500
>>> > Steve French <smfrench@xxxxxxxxx> wrote:
>>> >
>>> >> On Sat, Mar 17, 2012 at 9:53 AM, Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
>>> >> > 17 марта 2012 г. 15:12 пользователь Jeff Layton <jlayton@xxxxxxxxxx> написал:
>>> >> >> On Fri, 16 Mar 2012 18:09:24 +0300
>>> >> >> Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
>>> >> >>
>>> >> >>> Some servers sets this value less than 50 that was hardcoded and
>>> >> >>> we lost the connection if when we exceed this limit. Fix this by
>>> >> >>> respecting this value - not sending more than the server allows.
>>> >> >>>
>>> >> >>> Signed-off-by: Pavel Shilovsky <piastry@xxxxxxxxxxx>
>>> >> >>> ---
>>> >> >>>  fs/cifs/cifsfs.c    |    8 ++++----
>>> >> >>>  fs/cifs/cifsglob.h  |   10 +++-------
>>> >> >>>  fs/cifs/cifssmb.c   |    9 +++++++--
>>> >> >>>  fs/cifs/connect.c   |   11 ++++-------
>>> >> >>>  fs/cifs/dir.c       |    6 ++++--
>>> >> >>>  fs/cifs/file.c      |    4 ++--
>>> >> >>>  fs/cifs/transport.c |    4 ++--
>>> >> >>
>>> >> >> Introducing a behavior change like this at the beginning of the series
>>> >> >> is probably a mistake. You'll have no reasonable way to bisect down
>>> >> >> regressions, so you won't know if a problem is due to the change to a
>>> >> >> credit-based model or due to changing the client to respect the maxmpx.
>>> >> >>
>>> >> >> Instead of doing this, we should instead reorganize the code around a
>>> >> >> credit based model first, while attempting to mimic the existing
>>> >> >> behavior as closely as possible. Once that framework is in place, then
>>> >> >> change the behavior and start respecting the maxmpx.
>>> >> >>
>>> >> >> If you do that, then someone can reasonably bisect between those two
>>> >> >> changes and we can determine the source of a regression.
>>> >> >> --
>>> >> >> Jeff Layton <jlayton@xxxxxxxxxx>
>>> >> >
>>> >> > This was done to push this patch to 3.3 and stable tree as well.
>>> >> > That's why it is in the start of series.
>>> >> >
>>> >> > If we decide not to push it for now, I agree that this patch should be
>>> >> > in the modified version (according to changes from other patches) at
>>> >> > the end of the series.
>>> >> >
>>> >> > Steve, your thoughts?
>>> >> >
>>> >> > --
>>> >> > Best regards,
>>> >> > Pavel Shilovsky.
>>> >>
>>> >> We want to fix the reported (cifs) problem first with a simple patch
>>> >> (that would be suitable for stable too).
>>> >>
>>> >> The most important reported problems are mounts to servers
>>> >> running some versions of Windows Vista and some versions of Windows 7
>>> >> which set maxmpx below 50 and more importantly have maxworkitems
>>> >> (in the windows registry on the server) set at too small a value to handle
>>> >> the 20 or 30 async reads or writes which are now often in flight to them.
>>> >>
>>> >> The fix we need first is to honor the maxmpx that the server sets (and
>>> >> probably remove cifs_max_pending which is obsolete as
>>> >> a global module parm or simply do the minimum of cifs_max_pending
>>> >> and maxmpx as the earlier patch did).
>>> >> If desired (to workaround buggy
>>> >> servers) in a second patch we could add a mount parm if we think there are
>>> >> servers such as windows 7 for which we will need to set the
>>> >> negotiated value lower (due to their maxworkitems registry
>>> >> problem), or Samba for which maxmpx could be ignored
>>> >> (and as JRA and others indicated - we could send
>>> >> a thousand requests in parallel).
>>> >>
>>> >
>>> > By the way...
>>> >
>>> > What's the plan for blocking locks and echoes here? Do you plan to make
>>> > them just ignore the maxmpx limit?
>>>
>>> I think we shouldn't play with them for stable. We still don't know
>>> what exactly we should do with them - really risky to make experiments
>>> on stable kernels.
>>>
>>
>> Then I'm confused, since you said this in your earlier email:
>>
>>> This was done to push this patch to 3.3 and stable tree as well.
>>> That's why it is in the start of series.
>>>
>>
>> If you're not worrying about stable (and I agree that it's premature to
>> do that here), then you should focus on converting everything to use
>> the new credit-based scheme first and only then introduce the behavior
>> change of respecting maxmpx.
>>
>>> So, as we are sure that exceeding negotiated maxmpx value with any
>>> request except blocking lock and echo is wrong, I suggest to fix this
>>> - it is what this patch (#1) does. When we get more information about
>>> blocking locks and echos we can fix this too.
>>>
>>
>> Well, according to what MS wrote back to Steve when he asked, exceeding
>> the maxmpx is wrong _period_, regardless of the call. Echoes are fairly
>> easy to deal with, but blocking locks are another matter entirely. I
>> believe we need a fundamental rethink there.
>
> I meant that playing with things like blocking locks and echos is too
> risky for stable. But if we can fix the existing stable problem (not
> respecting MaxMpxCount value by other commands) with a strait-forward
> patch like this - we should do it.

yes. agreed

-- 
Thanks,

Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html