Re: [PATCH] xfs: close xc_cil list_empty() races with cil commit sequence

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 08, 2015 at 10:09:09AM +1000, Dave Chinner wrote:
> On Wed, Jun 24, 2015 at 10:04:01AM -0400, Brian Foster wrote:
> > We have seen somewhat rare reports of the following assert from
> > xlog_cil_push_background() failing during ltp tests or somewhat
> > innocuous desktop root fs workloads (e.g., virt operations, initramfs
> > construction):
> > 
> > 	ASSERT(!list_empty(&cil->xc_cil));
> > 
> > The reasoning behind the assert is that the transaction has inserted
> > items to the CIL and hit background push codepath all with
> > cil->xc_ctx_lock held for reading. This locks out background commit from
> > emptying the CIL, which acquires the lock for writing. Therefore, the
> > reasoning is that the items previously inserted in the CIL should still
> > be present.
> > 
> > The cil->xc_ctx_lock read lock is not sufficient to protect the xc_cil
> > list, however, due to how CIL insertion is handled.
> > xlog_cil_insert_items() inserts and reorders the dirty transaction items
> > to the tail of the CIL under xc_cil_lock. It uses list_move_tail() to
> > achieve insertion and reordering in the same block of code. This
> > function removes and reinserts an item to the tail of the list. If a
> > transaction commits an item that was already logged and thus already
> > resides in the CIL, and said item is the sole item on the list, the
> > removal and reinsertion creates a temporary state where the list is
> > actually empty.
> 
> The only way I can see this occurring is that we have to be committing a transaction that
> modifies the same object as the previous transaction commit that
> is still running through xfs_log_commit_cil(). e.g. racing
> timestamp modifications on an inode, and the CIL is empty:
> 
> thread A			thread B
> 
> lock(inode)			lock(inode)
> xfs_trans_join(inode)		<schedule>
> xfs_trans_log(inode)
> xfs_trans_commit(tp)
>   xfs_log_commit_cil()
>     lock(xc_ctx_lock)
>     <add inode to cil>
>     xfs_trans_free_items()
>       unlock(inode)
>       <schedule>
> 				<runs with inode lock>
> 				xfs_trans_join(inode)
> 				xfs_trans_log(inode)
> 				xfs_trans_commit(tp)
> 				  xfs_log_commit_cil()
> 				    lock(xc_ctx_lock)
> 				    xlog_cil_insert_items()
> 				      xlog_cil_insert_format_items()
>       <runs again>
>       xlog_cil_push_background
> 	ASSERT(!list_empty(cil))      list_move_tail(item, cil)
> 

Yes. I don't have the exact details of the original reproducer but it
basically required transactions to recommit an already CIL-resident item
with that item as the only entry in the CIL. The first transaction
inserts or reinserts and heads to the background push while the second
reinserts and creates the transient empty list state. I don't think the
CIL necessarily has to be empty to start, but either way the above looks
about right to me.

> 
> If that is the race, then the fix appears simple to me: call
> xlog_cil_push_background() before xfs_trans_free_items() so that we
> push the CIL before we unlock the items we just added to the CIL.
> i.e.:
> 
> thread A			thread B
> 
> lock(inode)			lock(inode)
> xfs_trans_join(inode)		<schedule>
> xfs_trans_log(inode)
> xfs_trans_commit(tp)
>   xfs_log_commit_cil()
>     lock(xc_ctx_lock)
>     <add inode to cil>
>     log_cil_push_background
> 	ASSERT(!list_empty(cil))
>     xfs_trans_free_items()
>       unlock(inode)
>       <schedule>
> 				<runs with inode lock>
> 				xfs_trans_join(inode)
> 				xfs_trans_log(inode)
> 				xfs_trans_commit(tp)
> 				  xfs_log_commit_cil()
> 				    lock(xc_ctx_lock)
> 				    xlog_cil_insert_items()
> 				      xlog_cil_insert_format_items()
> 				      list_move_tail(item, cil)

That seems like it should work for this particular case. I suppose it
depends on the nature of the transactions as opposed to closing the race
directly. It would require independent transactions to relog the same
item without locking to expose this instance of the race again, which
probably should never happen.

That doesn't help us for the other push side users of list_empty(),
however. For example, what about xlog_cil_push_now(), etc.? AFAICS, that
code still seems racy.

> > This state is not valid and thus should never be observed by concurrent
> > transaction commit-side checks in the circumstances outlined above.
> > Update all of the xc_cil checks to acquire xc_cil_lock before assessing
> > the state of xc_cil.
> 
> That will reintroduce the problem of lock contention between the
> commit and the push sides of the CIL, which the cil/push lock
> separation was added to solve:
> 

If the lock is a problem, how about trying to close the race on the
insert side? E.g., something like the following in
xlog_cil_insert_items():

	if (!list_is_last(&lip->li_cil, &cil->xc_cil))
		list_move_tail(&lip->li_cil, &cil->xc_cil);

... such that the list with a single item is never transiently empty.
Thoughts?

Brian

> commit 4bb928cdb900d0614f4766d5f1ca5bc3844f7656
> Author: Dave Chinner <dchinner@xxxxxxxxxx>
> Date:   Mon Aug 12 20:50:08 2013 +1000
> 
>     xfs: split the CIL lock
>     
>     The xc_cil_lock is used for two purposes - to protect the CIL
>     itself, and to protect the push/commit state and lists. These are
>     two logically separate structures and operations, so can have their
>     own locks. This means that pushing on the CIL and the commit wait
>     ordering won't contend for a lock with other transactions that are
>     completing concurrently. As the CIL insertion is the hottest path
>     throught eh CIL, this is a big win.
>     
>     Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
>     Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
>     Signed-off-by: Ben Myers <bpm@xxxxxxx>
> 
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs



[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux