On Thu, 25 Jan 2024 at 10:30, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
>
> There's a problem with the tasklet API - there is no reliable way how to
> free a structure that contains tasklet_struct. The problem is that the
> function tasklet_action_common calls task_unlock(t) after it called the
> callback. If the callback does something that frees tasklet_struct,
> task_unlock(t) would write into free memory.

Ugh.

I see what you're doing, but I have to say, I dislike this patch
immensely. It feels like a serious misdesign that is then papered over
with a hack.

I'd much rather see us trying to move away from tasklets entirely in
cases like this. Just say "you cannot do that".

In fact, of the two cases that want this new functionality, at least
dm-verity already makes tasklets a conditional feature that isn't even
enabled by default, and that was only introduced in the last couple of
years.

So I think dm-verity would be better off just removing tasklet use,
and we should check whether there are better models for handling the
latency issue.

The dm-crypt.c case looks different, but similar. I'm not sure why it
doesn't just use the workqueue for the "in interrupt" case. Like
dm-verity, it already does have a workqueue option, and it's a
setup-time option to say "don't use the workqueue for reads / writes".
But it feels like the code should just say "tough luck, in interrupt
context we *will* use workqueues".

So honestly, both of the cases you bring up seem to be just BUGGY. The
fix is not to extend tasklets to a new thing, the fix is to say "those
two uses of tasklets were broken, and should go away".

End result: I would suggest:

 - just get rid of the actively buggy use of tasklets. It's not
   necessary in either case.

 - look at introducing a "low-latency atomic workqueue" that looks
   *exactly* like a regular workqueue, but has the rule that it's
   per-cpu and functions on it cannot sleep

because I think one common issue with workqueues - which are better
designed than tasklets - is scheduling latency.

I think if we introduced a workqueue that worked more like a tasklet -
in that it's run in softirq context - but doesn't have the interface
mistakes of tasklets, a number of existing workqueue users might
decide that that is exactly what they want.

So we could have a per-cpu 'atomic_wq' that things can be scheduled
on, and that runs from softirqs just like tasklets, and shares the
workqueue queueing infrastructure but doesn't use the workqueue
threads.

Yes, the traditional use of workqueues is to be able to sleep and do
things in process context, so that sounds a bit odd, but let's face
it, we

 (a) already have multiple classes of workqueues

 (b) avoiding deep - and possibly recursive - stack depths is another
     reason people use workqueues

 (c) avoiding interrupt context is a real concern, even if you don't
     want to sleep

and I really *really* would like to get rid of tasklets entirely.

They started as this very specific hardcoded softirq thing used by
some drivers, and then the notion was generalized.

And I think it was generalized badly, as shown by this example.

I have added Tejun to the cc, so that he can throw his hands up in
horror and say "Linus, you're crazy, your drug-fueled idea would be
horrid because of Xyz".

But *maybe* Tejun has been taking the same drugs I have, and goes
"yeah, that would fit well".

Tejun? Please tell me I'm not on some bad crack..

              Linus