On Fri, Jul 23, 2021 at 12:06 PM Thierry Delisle <tdelisle@xxxxxxxxxxxx> wrote:
>
> > In my tests reclaimed nodes have their next pointers immediately set
> > to point to the list head. If the kernel gets a node with its @next
> > pointing to something else, then yes, things break down (the kernel
> > kills the process); this has happened occasionally when I had a bug in
> > the userspace code.
>
> I believe that approach is fine for production, but for testing it may
> not detect some bugs. For example, it may not detect the race I detail
> below.

While I think I have the idle servers list working, I now believe that
what peterz@ was suggesting is not much slower in the common case
(many idle workers; few, if any, idle servers) than having a list of
idle servers exposed to the kernel.

I think having a single idle server at head, not a list, is enough:
when a worker is added to the idle workers list, a single idle server
at head, if present, can be "popped" and woken. Userspace can maintain
the list of idle servers itself; having the kernel wake only one is
enough, because the woken server will pop all idle workers and decide
whether any other servers are needed to process the newly available
work.

[...]

> > Workers are trickier, as they can be woken by signals and then block
> > again, but stray signals are so bad here that I'm thinking of actually
> > not letting sleeping workers wake on signals. Other than signals
> > waking queued/unqueued idle workers, are there any other potential
> > races here?
>
> Timeouts on blocked threads is virtually the same as a signal I think. I
> can see that both could lead to attempts at waking workers that are not
> blocked.

I've got preemption working well enough to warrant a new RFC patchset
(I also have timeouts done, but these were easy). I'll clean things up,
change the idle servers logic so that only one idle server, not a list,
is exposed to the kernel, add some additional documentation (state
transitions, userspace code snippets, etc.), and post the v0.4 RFC
patchset to LKML later this week.

[...]
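
To make the "single idle server at head" idea above a bit more concrete,
here is a rough single-process sketch in C11. It is only a model of the
logic, not the UMCG ABI: all names (idle_worker, idle_server,
idle_server_slot, idle_server_publish, idle_worker_add,
idle_workers_pop_all) are hypothetical, the "kernel side" is an ordinary
function, and the wakeup is a plain counter standing in for a real wakeup.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the UMCG structures. */
struct idle_worker {
	struct idle_worker *next;
	int id;
};

struct idle_server {
	int id;
	int wakeups;	/* stands in for a real wakeup of the server */
};

/* Head of the idle workers list (a simple lock-free stack). */
static _Atomic(struct idle_worker *) idle_workers_head;

/* The single idle server slot: at most one server, not a list. */
static _Atomic(struct idle_server *) idle_server_slot;

/*
 * Server side: try to become the one idle server visible at head.
 * Returns 1 on success, 0 if the slot is taken; in that case the
 * caller would keep the server on its own userspace-only list.
 */
static int idle_server_publish(struct idle_server *s)
{
	struct idle_server *expected = NULL;

	return atomic_compare_exchange_strong(&idle_server_slot, &expected, s);
}

/*
 * "Kernel" side: add a worker to the idle workers list, then pop and
 * wake the idle server, if one is present. Returns the woken server,
 * or NULL if there was none.
 */
static struct idle_server *idle_worker_add(struct idle_worker *w)
{
	struct idle_worker *old;
	struct idle_server *s;

	assert(w != NULL);

	old = atomic_load(&idle_workers_head);
	do {
		w->next = old;	/* old is refreshed when the CAS fails */
	} while (!atomic_compare_exchange_weak(&idle_workers_head, &old, w));

	s = atomic_exchange(&idle_server_slot, (struct idle_server *)NULL);
	if (s)
		s->wakeups++;	/* assumption: a real wakeup would go here */
	return s;
}

/* Woken server: take every idle worker in one shot (newest first). */
static struct idle_worker *idle_workers_pop_all(void)
{
	return atomic_exchange(&idle_workers_head, (struct idle_worker *)NULL);
}
```

In this model a woken server calls idle_workers_pop_all(), schedules
what it can, and then decides in userspace whether to wake further
servers from its own private list. Only one server is ever visible in
the slot at a time.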