Re: mounts to Windows XP server do have problems with async write due to inFlight > 10

Steve French <smfrench@xxxxxxxxx> · Sat, 25 Jun 2011 14:21:16 -0500

Potential problem writing large files (with the async write support
which kicks off
more smbs in parallel)  to systems which set an artificially low limit on maxmpx
(not to the usual 50, but to e.g. 10 as some WindowsXP does) and enforce
the limit strictly by killing the session or throwing away the request.

The patch below did seem to help, but when setting wsize to 8K or 4K
I still did run into a problem, although
seemed to be better - so need to debug whether it is an "off by one"
problem where server (as Chris Hertel suggested) might be setting the
limit on number of writes in parallel to one less than the mpxcount
rather than mpxcount (when oplock enabled on the server) or perhaps whether
we just have too many async write requests blocked and hit some sort
of memory issue or corruption.

On Sat, Jun 25, 2011 at 5:27 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> On Fri, 24 Jun 2011 20:45:08 -0500
> Steve French <smfrench@xxxxxxxxx> wrote:
>
>> This seems to be close to what we would want but am not done testing/debugging:
>>
>
>> diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
>> index 1a9fe7f..1957f5d 100644
>> --- a/fs/cifs/cifssmb.c
>> +++ b/fs/cifs/cifssmb.c
>> @@ -453,6 +453,11 @@ CIFSSMBNegotiate(unsigned int xid, struct cifs_ses *ses)
>>               }
>>               server->sec_mode = (__u8)le16_to_cpu(rsp->SecurityMode);
>>               server->maxReq = le16_to_cpu(rsp->MaxMpxCount);
>> +             /* maxReq must be at least 2, almost all servers set to 50,
>> +                some old non-MS servers set incorrectly so set to default
>> +                (otherwise oplock break e.g. would fail */
>> +             if (server->maxReq < 2)
>> +                     server->maxReq = CIFS_MAX_REQ; /* 50 */
>                ^^^^^^^^
> Would it be better to set this to the minimum (2 or 3), and possibly
> warn on the first mount?

since the "normal" number used by almost all Windows (and even OS/2)
is 50, and was the value we have been using for many years, by default,
I thought that defaulting to the "normal" limit on an illegal value (which
means we don't change behavior) is safer.

>
>>               server->maxBuf = min((__u32)le16_to_cpu(rsp->MaxBufSize),
>>                               (__u32)CIFSMaxBufSize + MAX_CIFS_HDR_SIZE);
>>               server->max_vcs = le16_to_cpu(rsp->MaxNumberVcs);
>> @@ -560,6 +565,11 @@ CIFSSMBNegotiate(unsigned int xid, struct cifs_ses *ses)
>>       /* one byte, so no need to convert this or EncryptionKeyLen from
>>          little endian */
>>       server->maxReq = le16_to_cpu(pSMBr->MaxMpxCount);
>> +     /* maxReq must be at least 2, almost all servers set to 50,
>> +        some old non-MS servers set incorrectly so set to default
>> +        (otherwise oplock break e.g. would fail */
>> +     if (server->maxReq < 2)
>> +             server->maxReq = CIFS_MAX_REQ; /* 50 */
>
>                ^^^^^^^^
>                ditto.
>
>>       /* probably no need to store and check maxvcs */
>>       server->maxBuf = min(le32_to_cpu(pSMBr->MaxBufferSize),
>>                       (__u32) CIFSMaxBufSize + MAX_CIFS_HDR_SIZE);
>> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
>> index c761935..3f8cf63 100644
>> --- a/fs/cifs/connect.c
>> +++ b/fs/cifs/connect.c
>> @@ -95,7 +95,7 @@ cifs_reconnect(struct TCP_Server_Info *server)
>>       spin_unlock(&GlobalMid_Lock);
>>       server->maxBuf = 0;
>>
>> -     cFYI(1, "Reconnecting tcp session");
>> +     cERROR(1, "Reconnecting tcp session in flight %d",
>
>        ^^^^^^^^^^^
> I don't think this ought to be a cERROR.

Right - that was just a debug statement I threw in there to
try to figure out what Windows XP was doing to our sessions
(that line will stay a cFYI)

>> atomic_read(&server->inFlight));  /* BB FIXME */
>>
>>       /* before reconnecting the tcp session, mark the smb session (uid)
>>               and the tid bad so they are not used until reconnected */
>> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
>> index 147aa22..11c6585 100644
>> --- a/fs/cifs/transport.c
>> +++ b/fs/cifs/transport.c
>> @@ -264,7 +264,12 @@ static int wait_for_free_request(struct
>> TCP_Server_Info *server,
>>
>>       spin_lock(&GlobalMid_Lock);
>>       while (1) {
>> -             if (atomic_read(&server->inFlight) >= cifs_max_pending) {
>> +             /* We must check that don't exceed maxReq but on SMB Negotiate
>> +                it is not initted yet (maxReq must be at least two) */
>> +             if ((atomic_read(&server->inFlight) >= cifs_max_pending) ||
>> +                 ((atomic_read(&server->inFlight) >= server->maxReq) &&
>> +                   (server->maxReq > 1))) {
>
> Why continue to have a cifs_max_pending limit if you're going to
> respect maxReq? This should be a per-socket limit only, IMO...

Your point makes sense - and the memory usage of these structures
isn't that great - but it does give us additional control (to shrink
maxmpx if the server exhibits performance problems or perhaps even
if the client exhibits memory limits when many, many mounts each
with lots of pending requests).   My main reason for not wanting
to touch the cifs_max_pending (and thus not yet removing
the module_param) is to keep the patch smaller, and not change
external interfaces this close to release.

>>                       spin_unlock(&GlobalMid_Lock);
>>  #ifdef CONFIG_CIFS_STATS2
>>                       atomic_inc(&server->num_waiters);
>>
>> On Fri, Jun 24, 2011 at 7:22 PM, Steve French <smfrench@xxxxxxxxx> wrote:
>> > Should be fairly easy to fix to honor a "reasonable" mpx value (ie
>> > over 3 but under 50) - to avoid the problems to WindowsXP caused by
>> > async write kicking off more than 10 writes in parallel and exceeding
>> > maxMpx (the server kills the session

-- 
Thanks,

Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html