On 2/24/20 11:47 AM, Jens Axboe wrote: > On 2/21/20 7:52 AM, Oleg Nesterov wrote: >> On 02/20, Peter Zijlstra wrote: >>> >>> On Thu, Feb 20, 2020 at 06:22:02PM +0100, Oleg Nesterov wrote: >>>> @@ -68,10 +65,10 @@ task_work_cancel(struct task_struct *task, task_work_func_t func) >>>> * we raced with task_work_run(), *pprev == NULL/exited. >>>> */ >>>> raw_spin_lock_irqsave(&task->pi_lock, flags); >>>> + for (work = READ_ONCE(*pprev); work; ) { >>>> if (work->func != func) >>>> pprev = &work->next; >>> >>> But didn't you loose the READ_ONCE() of *pprev in this branch? >> >> Argh, yes ;) >> >>>> @@ -97,16 +94,16 @@ void task_work_run(void) >>>> * work->func() can do task_work_add(), do not set >>>> * work_exited unless the list is empty. >>>> */ >>>> + work = READ_ONCE(task->task_works); >>>> do { >>>> head = NULL; >>>> if (!work) { >>>> if (task->flags & PF_EXITING) >>>> head = &work_exited; >>>> else >>>> break; >>>> } >>>> + } while (!try_cmpxchg(&task->task_works, &work, head)); >>>> >>>> if (!work) >>>> break; >>> >>> But given that, as you say, cancel() could have gone and stole our head, >>> should we not try and install &work_exiting when PF_EXITING in that >>> case? >> >> cancel() can't do this, as long as we use cmpxchg/try_cmpxchg, not xchg(). >> This is what the comment before lock/unlock below tries to explain. >> >>> That is; should we not do continue in that case, instead of break. >> >> This is what we should do if we use xchg() like your previous version did. >> Or I am totally confused. Hmm, and when I re-read my words it looks as if >> I am trying to confuse you. >> >> So lets "simplify" this code assuming that PF_EXITING is set: >> >> work = READ_ONCE(task->task_works); >> do { >> head = NULL; >> if (!work) >> head = &work_exited; >> } while (!try_cmpxchg(&task->task_works, &work, head)); >> >> if (!work) >> break; >> >> If work == NULL after try_cmpxchg() _succeeds_, then the new "head" must >> be work_exited and we have nothing to do. >> >> If it was nullified by try_cmpxchg(&work) because we raced with cancel_(), >> then this try_cmpxchg() should have been failed. >> >> Right? >> >>> @@ -69,9 +68,12 @@ task_work_cancel(struct task_struct *tas >>> */ >>> raw_spin_lock_irqsave(&task->pi_lock, flags); >>> while ((work = READ_ONCE(*pprev))) { >>> - if (work->func != func) >>> + if (work->func != func) { >>> pprev = &work->next; >>> - else if (cmpxchg(pprev, work, work->next) == work) >>> + continue; >>> + } >>> + >>> + if (try_cmpxchg(pprev, &work, work->next)) >>> break; >> >> perhaps I misread this code, but it looks a bit strange to me... it doesn't >> differ from >> >> while ((work = READ_ONCE(*pprev))) { >> if (work->func != func) >> pprev = &work->next; >> else if (try_cmpxchg(pprev, &work, work->next)) >> break; >> } >> >> either way it is correct, the only problem is that we do not need (want) >> another READ_ONCE() if try_cmpxchg() fails. >> >>> void task_work_run(void) >>> { >>> struct task_struct *task = current; >>> - struct callback_head *work, *head, *next; >>> + struct callback_head *work, *next; >>> >>> for (;;) { >>> - /* >>> - * work->func() can do task_work_add(), do not set >>> - * work_exited unless the list is empty. >>> - */ >>> - do { >>> - head = NULL; >>> - work = READ_ONCE(task->task_works); >>> - if (!work) { >>> - if (task->flags & PF_EXITING) >>> - head = &work_exited; >>> - else >>> - break; >>> - } >>> - } while (cmpxchg(&task->task_works, work, head) != work); >>> + work = READ_ONCE(task->task_works); >>> + if (!work) { >>> + if (!(task->flags & PF_EXITING)) >>> + return; >>> + >>> + /* >>> + * work->func() can do task_work_add(), do not set >>> + * work_exited unless the list is empty. >>> + */ >>> + if (try_cmpxchg(&task->task_works, &work, &work_exited)) >>> + return; >>> + } >>> + >>> + work = xchg(&task->task_works, NULL); >>> + if (!work) >>> + continue; >> >> looks correct... > > Peter/Oleg, as you've probably noticed, I'm still hauling Oleg's > original patch around. Is the above going to turn into a separate patch > on top? If so, feel free to shove it my way as well for some extra > testing. Peter/Oleg, gentle ping on this query. I'd like to queue up the task poll rework on the io_uring side, but I still have this one at the start of the series: https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-task-poll&id=3b668ecf75f94f40c1faf9688ba3f32fb5e9f5d0 -- Jens Axboe