On Fri, Oct 10, 2014 at 03:31:46AM -0400, Xu Wang wrote: > From 59c6babab7c4ce8708036018143d2acc1477cc7f Mon Sep 17 00:00:00 2001 > From: George Wang <xuw@xxxxxxxxxx> > Date: Fri, 10 Oct 2014 15:02:14 +0800 > Subject: [PATCH] [PATCH] xfs: lock for removing item from cil->xc_cil in > xlog_cil_push > > There is a race condition when xlog_cil_force_lsn and xlog_cil_push for > cil->xc_cil items. When function xlog_cil_push_now called in > xlog_cil_force_lsn, and the xlog_cil_push run in workqueue just reaches the point for > setting cil->xc_current_sequence, the xlog_cil_force_lsn got the lock > and check the "sequence == cil->xc_current_sequence && > !list_empty(&cil->xc_cil)" for restart. The statement "sequence == > cil->xc_current_sequence" is true, but the cil->xc_cil is empty > according to the xlog_cil_push removed the items in cil->xc_cil without > lock protectionl. So the function xlog_cil_force_lsn will return > NULLCOMMITLSN, which means the cil log for current sequence will not be > submitted to disk. And if there is no more operations(for example, when > unmount fs, this situation happened in > xfs_unmountfs->xfs_log_sbcount->xfs_trans_commit), the xfs will hang up. > > Signed-off-by: George Wang <xuw@xxxxxxxxxx> > --- Isn't this fixed by the following commit? 8af3dcd3 xfs: xlog_cil_force_lsn doesn't always wait correctly With that patch, xlog_cil_push() adds the ctx to the committing list before draining the ctx and updating the current sequence number. xlog_cil_force_lsn() walks the committing list and checks the sequence number and ctx list, all under the push lock. That means that xlog_cil_force_lsn() should see the ctx either on the committing list and wait for it (commit_lsn != NULL), or not on the committing list at all. If it's not on the committing list yet, xlog_cil_push() won't have either drained the ctx or updated the sequence number, as it adds to the committing list first and that requires the push lock (which xlog_cil_force_lsn() holds and doesn't release until after the restart check). Am I missing something? Do you reproduce a problem with the latest tree that includes the above patch? Brian > fs/xfs/xfs_log_cil.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c > index f6b79e5..97ba527 100644 > --- a/fs/xfs/xfs_log_cil.c > +++ b/fs/xfs/xfs_log_cil.c > @@ -478,6 +478,8 @@ xlog_cil_push( > */ > lv = NULL; > num_iovecs = 0; > + > + spin_lock(&cil->xc_push_lock); > while (!list_empty(&cil->xc_cil)) { > struct xfs_log_item *item; > > @@ -530,7 +532,6 @@ xlog_cil_push( > * against the current sequence in log forces without risking > * deferencing a freed context pointer. > */ > - spin_lock(&cil->xc_push_lock); > cil->xc_current_sequence = new_ctx->sequence; > list_add(&ctx->committing, &cil->xc_committing); > spin_unlock(&cil->xc_push_lock); > -- > 1.9.3 > > > -- > George Wang 王旭 > > Kernel Quantity Engineer > Red Hat Software (Beijing) Co.,Ltd > IRC:xuw > Tel:+86-010-62608041 > Phone:15901231579 > 9/F, Tower C, Raycom > > _______________________________________________ > xfs mailing list > xfs@xxxxxxxxxxx > http://oss.sgi.com/mailman/listinfo/xfs _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs