On 2020/6/11 10:45, Dave Chinner wrote:
From: Dave Chinner <dchinner@xxxxxxxxxx> xlog_wait() on the CIL context can reference a freed context if the waiter doesn't get scheduled before the CIL context is freed. This can happen when a task is on the hard throttle and the CIL push aborts due to a shutdown. This was detected by generic/019: thread 1 thread 2 __xfs_trans_commit xfs_log_commit_cil <CIL size over hard throttle limit> xlog_wait schedule xlog_cil_push_work wake_up_all <shutdown aborts commit> xlog_cil_committed kmem_free remove_wait_queue spin_lock_irqsave --> UAF Fix it by moving the wait queue to the CIL rather than keeping it in in the CIL context that gets freed on push completion. Because the wait queue is now independent of the CIL context and we might have multiple contexts in flight at once, only wake the waiters on the push throttle when the context we are pushing is over the hard throttle size threshold.
Hi, Dave, How do you think about the following fix: 1. use autoremove_wake_func(), and remove remove_wait_queue() to avoid UAF. 2. add finish_wait(). @@ -576,12 +576,13 @@ xlog_wait( __releases(lock) { DECLARE_WAITQUEUE(wait, current); + wait.func = autoremove_wake_function; add_wait_queue_exclusive(wq, &wait); __set_current_state(TASK_UNINTERRUPTIBLE); spin_unlock(lock); schedule(); - remove_wait_queue(wq, &wait); + finish_wait(wq, &wait); } Best regards! Yu Kuai