2012/2/15 Pavel Shilovsky <piastry@xxxxxxxxxxx>: > 2012/2/15 Jeff Layton <jlayton@xxxxxxxxx>: >> On Wed, 15 Feb 2012 18:18:11 +0400 >> Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote: >> >>> 2012/2/12 Jeff Layton <jlayton@xxxxxxxxx>: >>> > -----BEGIN PGP SIGNED MESSAGE----- >>> > Hash: SHA1 >>> > >>> > On Thu, 9 Feb 2012 11:41:24 +0300 >>> > Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote: >>> > >>> >> and send no more than credits value requests at once. For SMB/CIFS >>> >> it's trivial: increment this value by receiving any message and >>> >> decrement by sending one. >>> >> >>> >> Signed-off-by: Pavel Shilovsky <piastry@xxxxxxxxxxx> >>> > >>> > Yes! Nice work. This is clearly the right approach for merging SMB2 >>> > support into the kernel. Extend the CIFS code to do what we need and >>> > share as much code as possible between the SMB1 and SMB2 support. By >>> > doing this piecemeal too we can avoid a wholesale "code dump" into the >>> > kernel. >>> > >>> > Some comments below. >>> > >>> >> --- >>> >> fs/cifs/cifsglob.h | 1 + >>> >> fs/cifs/cifssmb.c | 3 +++ >>> >> fs/cifs/connect.c | 10 ++++------ >>> >> fs/cifs/transport.c | 20 ++++++++++++++++---- >>> >> 4 files changed, 24 insertions(+), 10 deletions(-) >>> >> >>> >> diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h >>> >> index 76e7d8b..f96752a 100644 >>> >> --- a/fs/cifs/cifsglob.h >>> >> +++ b/fs/cifs/cifsglob.h >>> >> @@ -256,6 +256,7 @@ struct TCP_Server_Info { >>> >> bool noautotune; /* do not autotune send buf sizes */ >>> >> bool tcp_nodelay; >>> >> atomic_t inFlight; /* number of requests on the wire to server */ >>> >> + atomic_t credits; /* send no more requests at once */ >>> > >>> > Currently, it's not clear whether the inFlight counter is protected by >>> > the GlobalMid_lock or not. Ditto for this new counter. If it is >>> > protected by it, then there's no need for it to be an atomic value. The >>> > locking around this needs to be explicitly clear to avoid later bugs >>> > when we tinker with this code. >>> >>> We decrement inFlight value without holding GlobalMid_lock - so it >>> should be atomic. >>> >> >> But why is it this way? If it's atomic why bother with the locking at >> all? If you really need the lock then why bother making it atomic? >> >> Any time I see inconsistent locking like this, it's usually an indicator >> that something is wrong. If it's not wrong then it needs to be >> documented at the very least. >> >> My suspicion is that the existing code was slapped together without >> much thought as to what locks were necessary. Since you're in here now >> though, this ought to be corrected before you move forward with adding >> new code. > > Ok, it seems like a bug. Locking it with GlobalMid_lock looks better > to me (the explanation is below). > >> >>> > >>> > Also, should we just replace the inFlight counter with the credits >>> > counter? Is there some way to calculate the # of requests in flight given >>> > the value of "credits" in the SMB2 case? >>> >>> I don't think so. >>> >>> > >>> >> struct mutex srv_mutex; >>> >> struct task_struct *tsk; >>> >> char server_GUID[16]; >>> >> diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c >>> >> index 8b7794c..fbc4437 100644 >>> >> --- a/fs/cifs/cifssmb.c >>> >> +++ b/fs/cifs/cifssmb.c >>> >> @@ -717,6 +717,7 @@ cifs_echo_callback(struct mid_q_entry *mid) >>> >> >>> >> DeleteMidQEntry(mid); >>> >> atomic_dec(&server->inFlight); >>> >> + atomic_inc(&server->credits); >>> >> wake_up(&server->request_q); >>> >> } >>> >> >>> >> @@ -1670,6 +1671,7 @@ cifs_readv_callback(struct mid_q_entry *mid) >>> >> queue_work(system_nrt_wq, &rdata->work); >>> >> DeleteMidQEntry(mid); >>> >> atomic_dec(&server->inFlight); >>> >> + atomic_inc(&server->credits); >>> >> wake_up(&server->request_q); >>> >> } >>> >> >>> >> @@ -2111,6 +2113,7 @@ cifs_writev_callback(struct mid_q_entry *mid) >>> >> queue_work(system_nrt_wq, &wdata->work); >>> >> DeleteMidQEntry(mid); >>> >> atomic_dec(&tcon->ses->server->inFlight); >>> >> + atomic_inc(&tcon->ses->server->credits); >>> >> wake_up(&tcon->ses->server->request_q); >>> >> } >>> >> >>> >> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c >>> >> index 602f77c..6b95304 100644 >>> >> --- a/fs/cifs/connect.c >>> >> +++ b/fs/cifs/connect.c >>> >> @@ -648,12 +648,8 @@ static void clean_demultiplex_info(struct TCP_Server_Info *server) >>> >> * time to as little as two. >>> >> */ >>> >> spin_lock(&GlobalMid_Lock); >>> >> - if (atomic_read(&server->inFlight) >= cifs_max_pending) >>> >> - atomic_set(&server->inFlight, cifs_max_pending - 1); >>> >> - /* >>> >> - * We do not want to set the max_pending too low or we could end up >>> >> - * with the counter going negative. >>> >> - */ >>> >> + if (atomic_read(&server->credits) == 0) >>> >> + atomic_set(&server->credits, 1); >>> >> spin_unlock(&GlobalMid_Lock); >>> >> /* >>> >> * Although there should not be any requests blocked on this queue it >>> >> @@ -3759,9 +3755,11 @@ int cifs_negotiate_protocol(unsigned int xid, struct cifs_ses *ses) >>> >> if (server->maxBuf != 0) >>> >> return 0; >>> >> >>> >> + atomic_set(&ses->server->credits, cifs_max_pending); >>> >> rc = CIFSSMBNegotiate(xid, ses); >>> >> if (rc == -EAGAIN) { >>> >> /* retry only once on 1st time connection */ >>> >> + atomic_set(&ses->server->credits, cifs_max_pending); >>> >> rc = CIFSSMBNegotiate(xid, ses); >>> >> if (rc == -EAGAIN) >>> >> rc = -EHOSTDOWN; >>> >> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c >>> >> index 0cc9584..f1f927f 100644 >>> >> --- a/fs/cifs/transport.c >>> >> +++ b/fs/cifs/transport.c >>> >> @@ -260,17 +260,17 @@ static int wait_for_free_request(struct TCP_Server_Info *server, >>> >> if (long_op == CIFS_ASYNC_OP) { >>> >> /* oplock breaks must not be held up */ >>> >> atomic_inc(&server->inFlight); >>> >> + atomic_dec(&server->credits); >>> >> return 0; >>> >> } >>> >> >>> > >>> > I haven't gotten to your later patches so I may be reviewing >>> > prematurely, but I think we need to rethink the code above. Allowing >>> > the client to exceed the maxmpx is clearly problematic. Rather than >>> > doing that we ought to attempt to keep 2 credits in reserve at any >>> > given time -- one for oplock breaks and one for echo requests. >>> > >>> > If MaxMpx 2, then we should disable oplocks. If it's 1, then we >>> > should disable echoes and oplocks. The latter case may take some overhaul >>> > of how the echo timeout handling works. >>> >>> I agree that allowing to exceed the maxmps is wrong, but I think we >>> can leave it as it is for this patch. >>> >> >> I suppose, but I worry that adding the new complexity of credits >> without fixing that bug is going to make this really difficult to >> debug... >> >>> > >>> >> spin_lock(&GlobalMid_Lock); >>> >> while (1) { >>> >> - if (atomic_read(&server->inFlight) >= cifs_max_pending) { >>> >> + if (atomic_read(&server->credits) == 0) { >>> >>> Now I think that (according to echo and oplock problem) we need this to be: >>> >>> if (atomic_read(&server->credits) <= 0) { >>> >>> Right? >>> >> >> Ugh, I guess. Is this fixed later in this patchset? If not, then I >> suggest fixing that problem first before you introduce the new >> credit-based algorithm. Fixing this problem is really my immediate goal >> with this patchset. If we can do it in a way that also paves the way >> for the for SMB2 code then that's ideal. >> > > I understand. In SMB2 we have the dynamic MaxMpx value - so, we should > disable/enable echo/oplock on the fly. > > I suggest to do this for SMB2 on every arrival request: > > 1) lock GlobalMid_lock > 2) update credits and reqOnFlight values > 3) check if outstanding requests exist - goto 5 (can't make a desicion > about server configuration). > 4) change server configuration: > 4.1) if credits >= 3: enable oplocks & echos > 4.2) if credits == 2: disable oplocks & enable echos > 4.3) if credits == 1: disable oplocks & disable echos > (credits should not be 0 here; if so - reconnect) > 5) unlock GlobalMid_lock > > But for CIFS we should do something like step 4 only once after > getting MaxMpx value. > > Thoughts? > > -- > Best regards, > Pavel Shilovsky. I started to think that this approach doesn't work. We should have at least echo_retries number of slots for current echo configuration, because we make a decision to reconnect or not according to echo_retries number of echo requests without response. As for oplocks: the client can receive more than one oplock break at the time - so, need more than one slot for it. But in this case we can't figure out what number actually use. For the first implementation, I suggest to keep only echo_retries reserved slots for echos. Now, I am not sure if we should keep any number of credits for oplock breaks. Thoughts? -- Best regards, Pavel Shilovsky. -- To unsubscribe from this list: send the line "unsubscribe linux-cifs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html