On Thu, Jun 22, 2023 at 4:07 PM Tom Talpey <tom@xxxxxxxxxx> wrote: > > On 6/22/2023 2:42 AM, Shyam Prasad N wrote: > > On Thu, Jun 22, 2023 at 10:26 AM Steve French <smfrench@xxxxxxxxx> wrote: > >> > >> On Wed, Jun 21, 2023 at 1:25 PM Tom Talpey <tom@xxxxxxxxxx> wrote: > >>> > >>> On 6/20/2023 11:57 PM, Steve French wrote: > >>>>> Why this value of 10? I would go with 1, since we already reserve 1 > >>>> credit for oplocks. If the reasoning is to have enough credits to send multiple > >>>> lease/oplock acks, we should change the reserved count altogether. > >>>> > >>>> I think there could be some value in sending multiple lease break > >>>> responses (ie allow oplock credits to be a few more than 1), but my > >>>> main reasoning for this was to pick some number that was "safe" > >>>> (allowing 10 oplock/lease-break credits while in flight count is > >>>> non-zero is unlikely to be a problem) and would be unlikely to change > >>>> existing behavior. > >>>> > >>>> My thinking was that today's code allows oplock credits to be above 1 > >>>> (and keep growing in the server scenario you noticed) while multiple > >>>> requests continue to be in flight - so there could potentially be a > >>>> performance benefit during this period of high activity in having a > >>>> few lease breaks in flight at one time and unlikely to hurt anything - > >>>> but more importantly if we change the code to never allow oplock/lease > >>>> credits to be above one we could (unlikely but possible) have subtle > >>>> behavior changes that trigger a bug (since we would then have cases to > >>>> at least some servers where we never have two lease breaks in flight). > >>>> It seemed harmless to set the threshold to something slightly more > >>>> than one (so multiple lease breaks in flight would still be possible > >>>> and thus behavior would not change - but risk of credit starvation is > >>>> gone). If you prefer - I could pick a number like 2 or 3 credits > >>>> instead of 10. My intent was just to make it extremely unlikely that > >>>> any behavior would change (but would still fix the possible credit > >>>> starvation scenario) - so 2 or 3 would also probably be fine. > >>> > >>> What do you mean by "oplock credits"? There's no such field in the > >>> protocol. Is this some sort of reserved pool > >> > >> The client divides the total credits granted by the server into three > >> buckets (see struct TCP_Server_Info) > >> int echo_credits; /* echo reserved slots */ > >> int oplock_credits; /* lease/oplock break reserved slots */ > >> int credits; /* credits reserved for all other operation types */ > >> > >> If we run low on credits we can disable (temporarily) leases and > >> sending echo requests so we can continue to send other requests (open, > >> read, write, close etc.). As an example, if the server has granted > >> us 512 credits (total) if there were 4 large writes that were > >> responded to very slowly (and used up all of our credits), we could > >> time out if the write responses were very slow - since we would have > >> no way of sending an echo request periodically to see if the server > >> were still alive) - since we have 1 credit reserved for echo requests > >> though, even if the responses to the writes were slow, we would be > >> able to ping the server with an echo request to make sure it is still > >> alive. Similarly (and this can be very important with some servers > >> who could hold up granting credits if a lease break is pending) - we > >> have to be able to respond to a lease break even if all of our credits > >> are used up with large reads or writes so we reserve at least one > >> credit for sending lease break responses. > >> > >> The easiest way to think about this is that we reserve 1 credit for > >> echo and 1 credit for leases (although it can grow larger as we saw in > >> some server scenarios when they grant us more credits on lease break > >> responses than we asked for), but all the reset (all other remaining > >> credits) are available to send read or write or open or close (etc.). > >> When requests in flight goes back to zero we rebalance credits to > >> make sure we still have 1 reserved for echo and 1 reserved for > >> oplock/lease break responses (and everything else for the other > >> operation types). > >> > >> The scenario we were having a problem with though was one in which > >> requests inflight stayed above 0 for a long time and the server often > >> granted more credits than we asked for on lease break responses - this > >> caused the number of oplock/lease credits reserved to be larger than > >> the amount of credits for everything else and eventually starved the > >> client credits needed for normal operations (this would normally not > >> be an issue but the number of requests in flight stayed above zero for > >> a long time which kept us from rebalancing credits, and moving credits > >> back into the main credit pool from the oplock/lease break reserved > >> category - which could significantly hurt performance). > >> > >> There is normally not a problem with having more than one credit in > >> the lease break pool, but when it grows particularly large it could > >> hurt performance (or even hang). > >> > >> > >> > >> > >> > >>> If so, I really don't think any constant is appropriate. If the client > >>> can't calculate an expected number, we should keep it quite small. > > > > Hi Steve, > > > > During response handling, in case oplock_credits reach 0, we anyhow > > have it stealing one credit from regular credits today... > > else if (server->in_flight > 0 && server->oplock_credits == 0 && > > server->oplocks) { > > if (server->credits > 1) { > > server->credits--; > > server->oplock_credits++; > > } > > } > > > > So I don't think we need to reserve more than 1 credits for > > oplock_credits at all. > > I think the behaviour would not be affected by restricting oplock_credits to 1. Current version of the patch in for-next has the threshold at 3 not 1 for the maximum oplock/lease credits - presumably that is low enough, and current behavior won't change much but we avoid the credit starvation. > Good! And I really really hope we don't ever set echo_credits higher > than 1 either. I see no point in parallelizing the nearly-useless echo > procedure. I am not aware of any cases (or reported bugs) where echo credits would go higher. echo is just to periodically check if server is unresponsive/hung and to reduce chance of reconnect issues. It is rarely sent (every 60 seconds on inactive connections, and echo interval can be set higher on mount if you prefer) -- Thanks, Steve