I agree this is a good idea to retry if a channel is out of credits. The proposed solution should work for all operations but writepages. For writepages there is an internal logic that marks pages and mapping with errors: 2412 if (!wdata->cfile) { 2413 cifs_dbg(VFS, "No writable handle in writepages rc=%d\n", 2414 get_file_rc); 2415 if (is_retryable_error(get_file_rc)) 2416 rc = get_file_rc; 2417 else 2418 rc = -EBADF; 2419 } else 2420 rc = wdata_send_pages(wdata, nr_pages, mapping, wbc); 2421 2422 for (i = 0; i < nr_pages; ++i) 2423 unlock_page(wdata->pages[i]); 2424 2425 /* send failure -- clean up the mess */ 2426 if (rc != 0) { 2427 add_credits_and_wake_if(server, &wdata->credits, 0); 2428 for (i = 0; i < nr_pages; ++i) { 2429 if (is_retryable_error(rc)) 2430 redirty_page_for_writepage(wbc, 2431 wdata->pages[i]); 2432 else 2433 SetPageError(wdata->pages[i]); 2434 end_page_writeback(wdata->pages[i]); 2435 put_page(wdata->pages[i]); 2436 } 2437 if (!is_retryable_error(rc)) 2438 mapping_set_error(mapping, rc); 2439 } What I think we should do instead is to include -EDEADLK into the set of retryable errors conditionally to multi-channel being enabled and let cifs_writepages itself use its internal retry mechanism. -- Best regards, Pavel Shilovsky ср, 28 апр. 2021 г. в 08:35, Steve French <smfrench@xxxxxxxxx>: > > We have a simple repro we are looking at now with 4 channels > > We start with reasonable number of credits but end up with channels 3 > and 4 reconnecting and only having one credit on reconnect > > On Wed, Apr 28, 2021 at 10:23 AM Aurélien Aptel <aaptel@xxxxxxxx> wrote: > > > > Shyam Prasad N <nspmangalore@xxxxxxxxx> writes: > > > The idea is to retry on -EDEADLK (which we return when we run out of > > > credits and we have no requests in flight), so that another channel is > > > chosen for the next request. > > > > I agree with the idea. > > > > Have you been able to test those DEADLK codepaths? If so how, because we > > should add buildbot tests I think. > > > > For the function renaming, there is a precedent in the kernel to use a _ > > prefix for "sub-functions" instead of the _final suffix. > > > > > This fix mostly introduces a wrapper for all functions which call > > > cifs_pick_channel. In the wrapper function, the function is retried > > > when the error is -EDEADLK, and uses multichannel. > > > > I think we should put a limit on the number of retries. If it goes above > > some reasonable number print a warning and return EDEADLK. > > > > We could also hide the retry logic loop in a macro to avoid code > > duplication when possible. This could get rid of multiple simpler funcs > > I think if we use the macro in the caller. > > > > Something like (feel free to improve/modify): > > > > #define MULTICHAN_RETRY(chan_count, call) \ > > ({ \ > > int __rc; \ > > int __tries = 0; > > do { \ > > __rc = call; \ > > __tries++; \ > > } while (__tries < MULTICHAN_MAX_RETRY && __rc == -EDEADLK && chan_count > 1) \ > > WARN_ON(__tries == MULTICHAN_MAX_RETRY); \ > > __rc; \ > > }) > > > > > > > Signed-off-by: Shyam Prasad N <sprasad@xxxxxxxxxxxxx> > > > --- > > > fs/cifs/file.c | 96 +++++++++++- > > > fs/cifs/smb2inode.c | 93 ++++++++---- > > > fs/cifs/smb2ops.c | 113 +++++++++++++- > > > fs/cifs/smb2pdu.c | 359 ++++++++++++++++++++++++++++++++++++++++---- > > > 4 files changed, 593 insertions(+), 68 deletions(-) > > > > > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c > > > index 7e97aeabd616..7609b9739ce4 100644 > > > --- a/fs/cifs/file.c > > > +++ b/fs/cifs/file.c > > > @@ -2378,7 +2378,7 @@ wdata_send_pages(struct cifs_writedata *wdata, unsigned int nr_pages, > > > return rc; > > > } > > > > > > -static int cifs_writepages(struct address_space *mapping, > > > +static int cifs_writepages_final(struct address_space *mapping, > > > struct writeback_control *wbc) > > > { > > > struct inode *inode = mapping->host; > > > @@ -2543,6 +2543,21 @@ static int cifs_writepages(struct address_space *mapping, > > > return rc; > > > } > > > > > > +static int cifs_writepages(struct address_space *mapping, > > > + struct writeback_control *wbc) > > > +{ > > > + int rc; > > > + struct inode *inode = mapping->host; > > > + struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb); > > > + struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb); > > > + > > > + do { > > > + rc = cifs_writepages_final(mapping, wbc); > > > + } while (tcon->ses->chan_count > 1 && rc == -EDEADLK); > > > + > > > + return rc; > > > +} > > > > cifs_writepages can issue multiple writes. I suspect it can do some > > successful writes before it runs out of credit, and I'm not sure if > > individual pages are marked as flushed or not. Basically this func might > > not be idempotent so that's one codepath that we should definitely try I > > think (unless someone knows more and can explain me). > > > > Same goes for /some/ of the the other funcs... I think this should be > > documented/tried. > > > > > > +static int > > > +smb2_compound_op(const unsigned int xid, struct cifs_tcon *tcon, > > > + struct cifs_sb_info *cifs_sb, const char *full_path, > > > + __u32 desired_access, __u32 create_disposition, > > > + __u32 create_options, umode_t mode, void *ptr, int command, > > > + bool check_open, bool writable_path) > > > +{ > > > > I would use an int for flags intead of passing those 2 booleans. This > > way adding more flags in the future will be easier, and you can use > > named enum or defines in the code instead of true/false which will be > > more understandable. > > > > Cheers, > > -- > > Aurélien Aptel / SUSE Labs Samba Team > > GPG: 1839 CB5F 9F5B FB9B AA97 8C99 03C8 A49B 521B D5D3 > > SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE > > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München) > > > > > -- > Thanks, > > Steve