Re: [PATCH] cifs: try to pick channel with a minimum of credits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Discussed this with Aurelien.

> I'm only saying that on getting a EDEADLK error from
> compound_send_recv(), try another channel instead of returning that to
> userspace.
We both agreed that this will be a cleaner way to deal with the issue.
However, he pointed to me that the code changes will be more than what
I initially thought.
That needs more investigation.

The most likely case to hit the problem of EDEADLK is when the first
request on the channel is bigger than what we need.
The proposed fix should avoid that for basic requests by switching
channels (pending testing).
However, a large read/write could still result in EDEADLK error being returned.

Regards,
Shyam

On Thu, Mar 11, 2021 at 4:19 PM Shyam Prasad N <nspmangalore@xxxxxxxxx> wrote:
>
> > Some code relies on server->* values so if you swap the server pointer
> > it at the last moment I'm not sure those values will match the new
> > server ptr.
>
> I'm not sure that I understand this. I'm not suggesting that we swap.
> I'm only saying that on getting a EDEADLK error from
> compound_send_recv(), try another channel instead of returning that to
> userspace.
> Please let me know if I'm missing something here.
>
> Regards,
> Shyam
>
> On Mon, Mar 8, 2021 at 5:22 PM Aurélien Aptel <aaptel@xxxxxxxx> wrote:
> >
> > Hi Shyam,
> >
> > Thanks for the review. I also realized we cannot make this failproof so
> > I went in with the idea to just avoid picking easy, non-confusing cases
> > of unusable channels. If that's not good we can drop the patch all
> > together.
> >
> > Shyam Prasad N <nspmangalore@xxxxxxxxx> writes:
> > > Spent some time in this code path. Seems like this is more complicated
> > > than I initially thought.
> > > @Aurélien Aptel A few things to consider:
> > > 1. What if the credits that will be needed by the request is more than
> > > 3 (or any constant). We may still end up returning -EDEADLK to the
> > > user when we don't find enough credits, and there are not enough
> > > in_flight to satisfy the request. Ideally, we should still try other
> > > channels.
> >
> > Yes you're right, it won't prevent failing if more credits are
> > needed. The patch wasn't meant to be failproof, just to avoid most
> > occurences of the problem. It's a simple sanity check with some
> > false-positives and false-negatives.
> >
> > > 2. Echo and oplock use 1 reserved credit each, which the regular
> > > operations can steal, in case of shortage.
> >
> > There's a dedicated server->echo_credit I think.
> >
> > > 3. Reading server->credits without a lock can result in wildly
> > > different values, since some CPU architectures may not update it
> > > atomically. At worse, we may select a channel which may not have
> > > enough credits when we get to it
> >
> > If we are just doing a read on an aligned int, at least on x86 you will
> > get either a stale value or the new value, you cannot get a garbage mix
> > of both. Which CPU architecture doesn't provide cache coherency at that
> > level? This is a complex question I couldn't find an easy answer to.
> >
> > In any case cifs.ko is already assuming it's valid in various places. We
> > are doing it for some usage of the server->tcpStatus, ses->status,
> > tcon->tidStatus at least.
> >
> > The problems of atomic read in pick_channel() aside, the credits might
> > change from another process between the moment the channel is picked and
> > the moment the credits needed are spent (server->credit -= XYZ). In
> > which case it will also EDEADLK later on.
> >
> > > What if we do this...
> > > wait_for_compound_request() can return -EDEADLK when it doesn't get
> > > enough credits, and there are no requests in_flight.
> > > In compound_send_recv(), after we call wait_for_compound_request(), we
> > > can check it's return value. If it's -EDEADLK, we keep calling
> > > cifs_pick_another_channel(ses, bitmask) (bitmask has bits set for
> > > channels already selected and rejected) and
> > > wait_for_compound_request() in a loop till we find a channel which has
> > > enough credits, or we run out of channels; in which case we can return
> > > -EDEADLK.
> > > Makes sense? Do you see a problem with that approach?
> >
> > Some code relies on server->* values so if you swap the server pointer
> > it at the last moment I'm not sure those values will match the new
> > server ptr.
> >
> > Cheers,
> > --
> > Aurélien Aptel / SUSE Labs Samba Team
> > GPG: 1839 CB5F 9F5B FB9B AA97  8C99 03C8 A49B 521B D5D3
> > SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE
> > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München)
> >
>
>
> --
> Regards,
> Shyam



-- 
Regards,
Shyam




[Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux