On Tue, 2013-07-16 at 10:54 +1000, Dave Chinner wrote:
> On Mon, Jul 15, 2013 at 05:52:34PM -0500, Chandra Seetharaman wrote:
> > While testing and rearranging my pquota/gquota code, I stumbled
> > on a xfs_shutdown() during a mount. But the mount just hung.
> >
> > I debugged and found that there is a deadlock involving
> > &log->l_cilp->xc_ctx_lock.
> >
> > It is in a code path where &log->l_cilp->xc_ctx_lock is first
> > acquired in read mode and some levels down the same semaphore
> > is being acquired in write mode causing a deadlock.
> >
> > This is the stack:
> > xfs_log_commit_cil -> acquires &log->l_cilp->xc_ctx_lock in read mode
> > xlog_print_tic_res
> > xfs_force_shutdown
> > xfs_log_force_umount
> > xlog_cil_force
> > xlog_cil_force_lsn
> > xlog_cil_push_foreground
> > xlog_cil_push - tries to acquire same semaphore in write mode
> >
> > This patch fixes the deadlock by not calling xfs_force_shutdown() while
> > holding the semaphore, instead calling it after dropping the semaphore.
> >
> > Thanks to Dave for suggesting this solution.
> >
> > Signed-off-by: Chandra Seetharaman <sekharan@xxxxxxxxxx>
> >
> > ---
> >  fs/xfs/xfs_log.c      |    6 +++---
> >  fs/xfs/xfs_log_cil.c  |   10 ++++++----
> >  fs/xfs/xfs_log_priv.h |    2 +-
> >  fs/xfs/xfs_trans.c    |    2 +-
> >  4 files changed, 11 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index d852a2b..b9fa2da 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -1837,7 +1837,7 @@ xlog_state_finish_copy(
> >   * print out info relating to regions written which consume
> >   * the reservation
> >   */
> > -void
> > +int
> >  xlog_print_tic_res(
> >  	struct xfs_mount	*mp,
> >  	struct xlog_ticket	*ticket)
> > @@ -1941,7 +1941,7 @@ xlog_print_tic_res(
> >
> >  	xfs_alert_tag(mp, XFS_PTAG_LOGRES,
> >  		"xlog_write: reservation ran out. Need to up reservation");
> > -	xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> > +	return EFSCORRUPTED;
>
> Note the "SHUTDOWN_CORRUPT_INCORE" reason given here....
>
> > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > index 35a2299..d96022f 100644
> > --- a/fs/xfs/xfs_trans.c
> > +++ b/fs/xfs/xfs_trans.c
> > @@ -1547,7 +1547,7 @@ xfs_trans_commit(
> >  	xfs_trans_apply_dquot_deltas(tp);
> >
> >  	error = xfs_log_commit_cil(mp, tp, &commit_lsn, flags);
> > -	if (error == ENOMEM) {
> > +	if (error) {
> >  		xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR);
>
> Which is different to the reason given here. The shutdown reason
> should be maintained for this particular error....

I see. Is it ok if the error reason is not propagated to the
xlog_write() code path?

>
> Cheers,
>
> Dave.