On 01/02 2014 08:45, Dave Chinner wrote: > On Wed, Jan 01, 2014 at 10:38:36PM +0800, Jeff Liu wrote: >> On 12/30 2013 23:20 PM, Mark Tinguely wrote: >>> On 12/24/13 06:48, Jeff Liu wrote: >>>> From: Jie Liu<jeff.liu@xxxxxxxxxx> >>>> >>>> I can easily to hit a hang up while running fsstress and shutting down >>>> XFS on SSD via the tests below: >> <snip> >>>> >>>> Task1 Task2 >>>> >>>> list_add(&ctx->committing,&cil->xc_committing); >>>> >>>> xlog_wait(&cil->xc_commit_wait..) >>>> schedule()... >>>> >>>> Aborting!! list_del(&ctx->committing); >>>> wake_up_all(&cil->xc_commit_wait);<-- MISSING! >>>> >>>> As a result, we should handle this situation in xlog_cil_committed(). >>>> >>>> Signed-off-by: Jie Liu<jeff.liu@xxxxxxxxxx> >>>> --- >>>> fs/xfs/xfs_log_cil.c | 2 ++ >>>> 1 file changed, 2 insertions(+) >>>> >>>> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c >>>> index 5eb51fc..8c7e9c7 100644 >>>> --- a/fs/xfs/xfs_log_cil.c >>>> +++ b/fs/xfs/xfs_log_cil.c >>>> @@ -406,6 +406,8 @@ xlog_cil_committed( >>>> >>>> spin_lock(&ctx->cil->xc_push_lock); >>>> list_del(&ctx->committing); >>>> + if (abort) >>>> + wake_up_all(&ctx->cil->xc_commit_wait); >>>> spin_unlock(&ctx->cil->xc_push_lock); >>>> >>>> xlog_cil_free_logvec(ctx->lv_chain); >>> >>> Hi Jeff, I hope you had a good break, >> Thanks :) >>> >>> So you are saying the wakeup in the CIL push error path missing? >> Yes. >> >>> I agree with that. But I don't like adding a new wake up to >>> xlog_cil_committed(), which is after the log buffer is written. > > Hi Mark, any particular reason why you don't like this? It would be > great if you could explain why you don't like something up front so > we don't have to guess at your reasons or wait for another round > trip in the conversation to find them out.... > >> IMO this callback would be called if any problem is happened before >> the log buffer is written as well, e.g, >> xlog_cil_push()->xfs_log_notify() <-- failed >> | >> |->xlog_cil_committed() > > Right, it's the generic CIL commit handler and it can be called > directly or from log IO completion. > > The question is this: it is safe to wake up waiters from log IO > completion if that is where an abort is first triggered from (i.e. > on log IO error). From what I can see, it is safe to do the wakeup > on abort because the iclog iwe attach the IO completion callback to > in xlog_cil_push() cannot be put under IO until we release the > reference gained in xfs_log_done(). > > But this raises an interesting question - the wakeup in > xlog_cil_push() is done before the log IO for the checkpoint is > complete, so the wakeup is occurring on checkpoint processing > completion, not iclog IO completion. i.e. the actual log force > sleeping still needs to wait for log IO completion to occur after > then CIL has been pushed. This occurs in the _xfs_log_force{_lsn}() > wrappers, where iclog state changes are waited for. > > Why is this important? The iclog write/flush wakeups are all done > from IO completion context, except for the force shutdown case, > which calls xlog_state_do_callback(log, XFS_LI_ABORTED, NULL); to > trigger wakeups and aborts via the log IO completion callbacks on > all the outstanding iclogs. > > IOWs, we've already got a design pattern that says: > > - run log force wakeups from IO completions > - on shutdown, run IO completions directly to abort pending > log operations > > So, really, issuing wakeups from iclog IO completion on log aborts > or errors is exactly what we currently do to ensure that shutdowns > don't leave processes waiting on log force completion behind. So > from that perspective, adding the wakeup on abort to > xlog_cil_committed() seems like the right approach to take. > > Actually, there's more issues here: xlog_cil_push() leaks a > reference to the iclog when it triggers the error path via > xfs_log_notify() failure. At this point we always need to release > the iclog. Hence if xfs_log_notify() were to always add the IO > completion to the iclog and xlog_cil_committed() issued wakeups on > abort errors, then we could completely ignore the log state in > xfs_log_notify() and have xfs_log_release_iclog() capture the IO > error and the subsequent shutdown would handle the aborts and > wakeups.... There is indeed an iclog ref leak after digging into the code. > > Hmmm, then xfs_log_notify could go away, and the callback list could > be made a lockless list and the ic_callback_lock could go away, > too.... Hence we can fold xfs_log_notify() into xlog_cil_push() directly, but am not sure I get the reason why we could make the callback list lockless: When attaching the IO completion callback to iclog, we assert the iclog state to be XLOG_STATE_ACTIVE or XLOG_STATE_WANT_SYNC, but in the other place where we also try to get the ic_callback_lock, i.e, xlog_state_do_callback(), we only perform callbacks for iclogs that in XLOG_STATE_DONE_SYNC or in XLOG_STATE_DO_CALLBACK, so they're already prevented from the potential race situations, am I understood correctly? Also, it seems like the iclog->ic_callback_tail can go away as well, since it only serves as a left value. Thanks, -Jeff _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs