On Thu, Jan 13, 2022 at 03:39:39PM -0800, Peter Oskolkov wrote:
> The original idea of a UMCG server was that it was used as a proxy
> for a CPU: if a worker associated with the server is RUNNING, the
> server itself was never allowed to be RUNNING as well; when
> umcg_wait() returned for a server, it meant that its worker became
> BLOCKED.
>
> In the new (old?) "per server runqueues" model implemented in the
> previous patch in this patchset, servers are woken when a previously
> blocked worker on their runqueue finishes its blocking operation,
> even if the currently RUNNING worker continues running.
>
> As a server may now run while a worker assigned to it is running,
> the original idea of having at most a single RUNNING worker per
> server, as a means to control the number of running workers, is not
> really enforced, and the server, woken by a worker doing a
> BLOCKED=>RUNNABLE transition, may then call sys_umcg_wait() with a
> second/third/etc. worker to run.
>
> Support this scenario by adding a blocked worker list: when a worker
> transitions RUNNING=>BLOCKED, not only is its server woken, but the
> worker is also added to the blocked worker list of its server.
>
> This change introduces the following benefits:
> - block detection now behaves similarly to wake detection; without
>   this patch, worker wakeups added wakees to the list and woke the
>   server, while worker blocks only woke the server without adding
>   blocked workers to a list, forcing servers to explicitly check the
>   worker's state;
> - if the blocked worker woke sufficiently quickly, the server woken
>   on the block event would observe its worker as RUNNABLE, so the
>   block event had to be inferred rather than explicitly signalled by
>   the worker being added to the blocked worker list;
> - it is now possible for a single server to control several RUNNING
>   workers, which makes writing userspace schedulers simpler for
>   smaller processes that do not need to scale beyond one "server";
> - if the userspace wants to keep at most a single RUNNING worker per
>   server, and have multiple servers with their own runqueues, this
>   model is also naturally supported here.
>
> So this change basically decouples block/wake detection from M:N
> threading, in the sense that the number of servers no longer has to
> be M or N, but is instead driven by the scalability needs of the
> userspace application.

So I don't object to having this blocking list; we had that early on
in the discussions.

*However*, combined with WF_CURRENT_CPU this 1:N userspace model
doesn't really make sense, and combined with Proxy-Exec (if we ever
get that sorted) it will fundamentally not work.

More consideration is needed, I think...