Re: [PATCH 1/6] CIFS: Introduce credit-based flow control

Pavel Shilovsky <piastry@xxxxxxxxxxx> · Wed, 15 Feb 2012 22:04:11 +0400

2012/2/15 Jeff Layton <jlayton@xxxxxxxxx>:
> On Wed, 15 Feb 2012 18:18:11 +0400
> Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
>
>> 2012/2/12 Jeff Layton <jlayton@xxxxxxxxx>:
>> > -----BEGIN PGP SIGNED MESSAGE-----
>> > Hash: SHA1
>> >
>> > On Thu,  9 Feb 2012 11:41:24 +0300
>> > Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
>> >
>> >> and send no more than credits value requests at once. For SMB/CIFS
>> >> it's trivial: increment this value by receiving any message and
>> >> decrement by sending one.
>> >>
>> >> Signed-off-by: Pavel Shilovsky <piastry@xxxxxxxxxxx>
>> >
>> > Yes! Nice work. This is clearly the right approach for merging SMB2
>> > support into the kernel. Extend the CIFS code to do what we need and
>> > share as much code as possible between the SMB1 and SMB2 support. By
>> > doing this piecemeal too we can avoid a wholesale "code dump" into the
>> > kernel.
>> >
>> > Some comments below.
>> >
>> >> ---
>> >>  fs/cifs/cifsglob.h  |    1 +
>> >>  fs/cifs/cifssmb.c   |    3 +++
>> >>  fs/cifs/connect.c   |   10 ++++------
>> >>  fs/cifs/transport.c |   20 ++++++++++++++++----
>> >>  4 files changed, 24 insertions(+), 10 deletions(-)
>> >>
>> >> diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
>> >> index 76e7d8b..f96752a 100644
>> >> --- a/fs/cifs/cifsglob.h
>> >> +++ b/fs/cifs/cifsglob.h
>> >> @@ -256,6 +256,7 @@ struct TCP_Server_Info {
>> >>       bool noautotune;                /* do not autotune send buf sizes */
>> >>       bool tcp_nodelay;
>> >>       atomic_t inFlight;  /* number of requests on the wire to server */
>> >> +     atomic_t credits;  /* send no more requests at once */
>> >
>> > Currently, it's not clear whether the inFlight counter is protected by
>> > the GlobalMid_lock or not. Ditto for this new counter. If it is
>> > protected by it, then there's no need for it to be an atomic value. The
>> > locking around this needs to be explicitly clear to avoid later bugs
>> > when we tinker with this code.
>>
>> We decrement inFlight value without holding GlobalMid_lock - so it
>> should be atomic.
>>
>
> But why is it this way? If it's atomic why bother with the locking at
> all? If you really need the lock then why bother making it atomic?
>
> Any time I see inconsistent locking like this, it's usually an indicator
> that something is wrong. If it's not wrong then it needs to be
> documented at the very least.
>
> My suspicion is that the existing code was slapped together without
> much thought as to what locks were necessary. Since you're in here now
> though, this ought to be corrected before you move forward with adding
> new code.

Ok, it seems like a bug. Locking it with GlobalMid_lock looks better
to me (the explanation is below).

>
>> >
>> > Also, should we just replace the inFlight counter with the credits
>> > counter? Is there some way to calculate the # of requests in flight given
>> > the value of "credits" in the SMB2 case?
>>
>> I don't think so.
>>
>> >
>> >>       struct mutex srv_mutex;
>> >>       struct task_struct *tsk;
>> >>       char server_GUID[16];
>> >> diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
>> >> index 8b7794c..fbc4437 100644
>> >> --- a/fs/cifs/cifssmb.c
>> >> +++ b/fs/cifs/cifssmb.c
>> >> @@ -717,6 +717,7 @@ cifs_echo_callback(struct mid_q_entry *mid)
>> >>
>> >>       DeleteMidQEntry(mid);
>> >>       atomic_dec(&server->inFlight);
>> >> +     atomic_inc(&server->credits);
>> >>       wake_up(&server->request_q);
>> >>  }
>> >>
>> >> @@ -1670,6 +1671,7 @@ cifs_readv_callback(struct mid_q_entry *mid)
>> >>       queue_work(system_nrt_wq, &rdata->work);
>> >>       DeleteMidQEntry(mid);
>> >>       atomic_dec(&server->inFlight);
>> >> +     atomic_inc(&server->credits);
>> >>       wake_up(&server->request_q);
>> >>  }
>> >>
>> >> @@ -2111,6 +2113,7 @@ cifs_writev_callback(struct mid_q_entry *mid)
>> >>       queue_work(system_nrt_wq, &wdata->work);
>> >>       DeleteMidQEntry(mid);
>> >>       atomic_dec(&tcon->ses->server->inFlight);
>> >> +     atomic_inc(&tcon->ses->server->credits);
>> >>       wake_up(&tcon->ses->server->request_q);
>> >>  }
>> >>
>> >> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
>> >> index 602f77c..6b95304 100644
>> >> --- a/fs/cifs/connect.c
>> >> +++ b/fs/cifs/connect.c
>> >> @@ -648,12 +648,8 @@ static void clean_demultiplex_info(struct TCP_Server_Info *server)
>> >>        * time to as little as two.
>> >>        */
>> >>       spin_lock(&GlobalMid_Lock);
>> >> -     if (atomic_read(&server->inFlight) >= cifs_max_pending)
>> >> -             atomic_set(&server->inFlight, cifs_max_pending - 1);
>> >> -     /*
>> >> -      * We do not want to set the max_pending too low or we could end up
>> >> -      * with the counter going negative.
>> >> -      */
>> >> +     if (atomic_read(&server->credits) == 0)
>> >> +             atomic_set(&server->credits, 1);
>> >>       spin_unlock(&GlobalMid_Lock);
>> >>       /*
>> >>        * Although there should not be any requests blocked on this queue it
>> >> @@ -3759,9 +3755,11 @@ int cifs_negotiate_protocol(unsigned int xid, struct cifs_ses *ses)
>> >>       if (server->maxBuf != 0)
>> >>               return 0;
>> >>
>> >> +     atomic_set(&ses->server->credits, cifs_max_pending);
>> >>       rc = CIFSSMBNegotiate(xid, ses);
>> >>       if (rc == -EAGAIN) {
>> >>               /* retry only once on 1st time connection */
>> >> +             atomic_set(&ses->server->credits, cifs_max_pending);
>> >>               rc = CIFSSMBNegotiate(xid, ses);
>> >>               if (rc == -EAGAIN)
>> >>                       rc = -EHOSTDOWN;
>> >> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
>> >> index 0cc9584..f1f927f 100644
>> >> --- a/fs/cifs/transport.c
>> >> +++ b/fs/cifs/transport.c
>> >> @@ -260,17 +260,17 @@ static int wait_for_free_request(struct TCP_Server_Info *server,
>> >>       if (long_op == CIFS_ASYNC_OP) {
>> >>               /* oplock breaks must not be held up */
>> >>               atomic_inc(&server->inFlight);
>> >> +             atomic_dec(&server->credits);
>> >>               return 0;
>> >>       }
>> >>
>> >
>> > I haven't gotten to your later patches so I may be reviewing
>> > prematurely, but I think we need to rethink the code above. Allowing
>> > the client to exceed the maxmpx is clearly problematic. Rather than
>> > doing that we ought to attempt to keep 2 credits in reserve at any
>> > given time -- one for oplock breaks and one for echo requests.
>> >
>> > If MaxMpx 2, then we should disable oplocks. If it's 1, then we
>> > should disable echoes and oplocks. The latter case may take some overhaul
>> > of how the echo timeout handling works.
>>
>> I agree that allowing to exceed the maxmps is wrong, but I think we
>> can leave it as it is for this patch.
>>
>
> I suppose, but I worry that adding the new complexity of credits
> without fixing that bug is going to make this really difficult to
> debug...
>
>> >
>> >>       spin_lock(&GlobalMid_Lock);
>> >>       while (1) {
>> >> -             if (atomic_read(&server->inFlight) >= cifs_max_pending) {
>> >> +             if (atomic_read(&server->credits) == 0) {
>>
>> Now I think that (according to echo and oplock problem) we need this to be:
>>
>> if (atomic_read(&server->credits) <= 0) {
>>
>> Right?
>>
>
> Ugh, I guess. Is this fixed later in this patchset? If not, then I
> suggest fixing that problem first before you introduce the new
> credit-based algorithm. Fixing this problem is really my immediate goal
> with this patchset. If we can do it in a way that also paves the way
> for the for SMB2 code then that's ideal.
>

I understand. In SMB2 we have the dynamic MaxMpx value - so, we should
disable/enable echo/oplock on the fly.

I suggest to do this for SMB2 on every arrival request:

1) lock GlobalMid_lock
2) update credits and reqOnFlight values
3) check if outstanding requests exist - goto 5 (can't make a desicion
about server configuration).
4) change server configuration:
  4.1) if credits >= 3: enable oplocks & echos
  4.2) if credits == 2: disable oplocks & enable echos
  4.3) if credits == 1: disable oplocks & disable echos
(credits should not be 0 here; if so - reconnect)
5) unlock GlobalMid_lock

But for CIFS we should do something like step 4 only once after
getting MaxMpx value.

Thoughts?

-- 
Best regards,
Pavel Shilovsky.
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html