On Thu, Jul 30, 2020 at 02:35:07PM -0700, Samuel Mendoza-Jonas wrote: > From: Rik van Riel <riel@xxxxxxxxxxx> > > commit cdea5459ce263fbc963657a7736762ae897a8ae6 upstream > > The code in xlog_wait uses the spinlock to make adding the task to > the wait queue, and setting the task state to UNINTERRUPTIBLE atomic > with respect to the waker. > > Doing the wakeup after releasing the spinlock opens up the following > race condition: > > Task 1 task 2 > add task to wait queue > wake up task > set task state to UNINTERRUPTIBLE > > This issue was found through code inspection as a result of kworkers > being observed stuck in UNINTERRUPTIBLE state with an empty > wait queue. It is rare and largely unreproducable. > > Simply moving the spin_unlock to after the wake_up_all results > in the waker not being able to see a task on the waitqueue before > it has set its state to UNINTERRUPTIBLE. > > This bug dates back to the conversion of this code to generic > waitqueue infrastructure from a counting semaphore back in 2008 > which didn't place the wakeups consistently w.r.t. to the relevant > spin locks. > > [dchinner: Also fix a similar issue in the shutdown path on > xc_commit_wait. Update commit log with more details of the issue.] > > Fixes: d748c62367eb ("[XFS] Convert l_flushsema to a sv_t") > Reported-by: Chris Mason <clm@xxxxxx> > Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx # 4.9.x-4.19.x > [modified for contextual change near xlog_state_do_callback()] > Signed-off-by: Samuel Mendoza-Jonas <samjonas@xxxxxxxxxx> > Reviewed-by: Frank van der Linden <fllinden@xxxxxxxxxx> > Reviewed-by: Suraj Jitindar Singh <surajjs@xxxxxxxxxx> > Reviewed-by: Benjamin Herrenschmidt <benh@xxxxxxxxxx> > Reviewed-by: Anchal Agarwal <anchalag@xxxxxxxxxx> > --- > This issue was fixed in v5.4 but didn't appear to make it to stable. The > fixed commit goes back to v2.6, so backport this to stable kernels > before v5.4. The only difference is a contextual change at > xlog_state_do_callback(log, XFS_LI_ABORTED, NULL); > Which in v5.4 is > xlog_state_do_callback(log, true, NULL); Now queued up, thanks. greg k-h