Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> writes: > On 19/10/2022 21:19, Jason A. Donenfeld wrote: >> On Wed, Oct 19, 2022 at 09:09:28PM +0100, Tvrtko Ursulin wrote: >>> Hm why is kthread_stop() after kthread_run() abuse? I don't see it in >>> kerneldoc that it must not be used for stopping threads. >> Because you don't want it to stop. You want to wait until it's done. If >> you call stop right after run, it will even stop it before it even >> begins to run. That's why you wind up sprinkling your msleeps >> everywhere, indicating that clearly this is not meant to work that way. > Not after kthread_run which wakes it up already. If the kerneldoc for > kthread_stop() is correct at least... In which case I really do think > that the yields are pointless/red herring. Perhaps they predate kthread_run and > then they were even wrong. > >>> Yep the yields and sleeps are horrible and will go. But they are also >>> not relevant for the topic at hand. >> Except they very much are. The reason you need these is because you're >> using kthread_stop() for something it's not meant to do. > > It is supposed to assert kthread_should_stop() which thread can look at as when > to exit. Except that now it can fail to get to that controlled exit > point. Granted that argument is moot since it implies incomplete error handling > in the thread anyway. > > Btw there are actually two use cases in our code base. One is thread controls > the exit, second is caller controls the exit. Anyway... > >>> Never mind, I was not looking for anything more than a suggestion on how >>> to maybe work around it in piece as someone is dealing with the affected >>> call sites. >> Sultan's kthread_work idea is probably the right direction. This would >> seem to have what you need. > > ... yes, it can be converted. Even though for one of the two use cases we need > explicit signalling. There now isn't anything which would assert > kthread_should_stop() without also asserting the signal, right?. Neither > I found that the thread work API can do it. > > Fingers crossed we were the only "abusers" of the API. There's a quite a number > of kthread_stop callers and it would be a large job to audit them all. I have been out and am coming to this late. Did this get resolved? I really don't expect this affected much of anything else as the code sat in linux-next for an entire development cycle before being merged. But I would like to make certain problems with this change were resolved. Thank you, Eric