On 4/7/20 4:39 AM, Oleg Nesterov wrote:
> On 04/06, Jens Axboe wrote:
>>
>> +extern struct callback_head task_work_exited;
>> +
>> static inline void
>> init_task_work(struct callback_head *twork, task_work_func_t func)
>> {
>> @@ -19,7 +21,7 @@ void __task_work_run(void);
>>
>> static inline bool task_work_pending(void)
>> {
>> -    return current->task_works;
>> +    return current->task_works && current->task_works != &task_work_exited;
>                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Well, this penalizes all the current users, they can't hit work_exited.
>
> IIUC, this is needed for the next change which adds task_work_run() into
> io_ring_ctx_wait_and_kill(), right?

Right - so you'd rather I localize that check there instead? Can certainly
do that.

> could you explain how the exiting can call io_ring_ctx_wait_and_kill()
> after it passed exit_task_work() ?

Sure, here's a trace where it happens:

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 0 P4D 0
Oops: 0010 [#1] SMP
CPU: 51 PID: 7290 Comm: mc_worker Kdump: loaded Not tainted 5.2.9-03696-gf2db01aa1e97 #190
Hardware name: Quanta Leopard ORv2-DDR4/Leopard ORv2-DDR4, BIOS F06_3B17 03/16/2018
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc9002721bc78 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffffff82d10ff0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffc9002721bc60 RDI: ffffffff82d10ff0
RBP: ffff889fd220e8f0 R08: 0000000000000000 R09: ffffffff812f1000
R10: ffff88bfa5fcb100 R11: 0000000000000000 R12: ffff889fd220e200
R13: ffff889fd220e92c R14: ffffffff82d10ff0 R15: 0000000000000000
FS:  00007f03161ff700(0000) GS:ffff88bfff9c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000002409004 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __task_work_run+0x66/0xa0
 io_ring_ctx_wait_and_kill+0x14e/0x3c0
 io_uring_release+0x1c/0x20
 __fput+0xaa/0x200
 __task_work_run+0x66/0xa0
 do_exit+0x9cf/0xb40
 do_group_exit+0x3a/0xa0
 get_signal+0x152/0x800
 do_signal+0x36/0x640
 ? __audit_syscall_exit+0x23c/0x290
 exit_to_usermode_loop+0x65/0x100
 do_syscall_64+0xd4/0x100
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f77057fe8ce
Code: Bad RIP value.
RSP: 002b:00007f03161f8960 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
RAX: 000000000000002a RBX: 00007f03161f8a30 RCX: 00007f77057fe8ce
RDX: 0000000000004040 RSI: 00007f03161f8a30 RDI: 00000000000057a4
RBP: 00007f03161f8980 R08: 0000000000000000 R09: 00007f03161fb7b8
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
R13: 0000000000004040 R14: 00007f02dc12bc00 R15: 00007f02dc21b900

--
Jens Axboe