On Fri, 2013-05-24 at 09:42 +1000, Dave Chinner wrote:
> On Thu, May 23, 2013 at 01:09:02PM -0500, Chandra Seetharaman wrote:
> > On Thu, 2013-05-23 at 09:41 +1000, Dave Chinner wrote:
> > > On Wed, May 22, 2013 at 06:12:43PM -0500, Chandra Seetharaman wrote:
> > > > Hello,
> > > >
> > > > While testing and rearranging my pquota/gquota code, I stumbled on a
> > > > xfs_shutdown() during a mount. But the mount just hung.
> > > >
> > > > I debugged and found that it is in a code path where
> > > > &log->l_cilp->xc_ctx_lock is first acquired in read mode and some levels
> > > > down the same semaphore is being acquired in write mode causing a
> > > > deadlock.
> > > >
> > > > This is the stack:
> > > > xfs_log_commit_cil -> acquires &log->l_cilp->xc_ctx_lock in read mode
> > > > xlog_print_tic_res
> > > > xfs_force_shutdown
> > > > xfs_log_force_umount
> > > > xlog_cil_force
> > > > xlog_cil_force_lsn
> > > > xlog_cil_push_foreground
> > > > xlog_cil_push - tries to acquire same semaphore in write mode
> > >
> > > Which means you had a transaction reservation overrun. Is it
> > > reproducible? Do you have the output from xlog_print_tic_res()?
> > > Because:
> >
> > Here it is:
> >
> > May 23 10:48:52 test46 kernel: [ 77.500728] XFS (sdh8): xlog_write: reservation summary:
> > May 23 10:48:52 test46 kernel: [ 77.500728] trans type = QM_SBCHANGE (26)
> > May 23 10:48:52 test46 kernel: [ 77.500728] unit res = 2740 bytes
> > May 23 10:48:52 test46 kernel: [ 77.500728] current res = -48 bytes
> > May 23 10:48:52 test46 kernel: [ 77.500728] total reg = 0 bytes (o/flow = 0 bytes)
> > May 23 10:48:52 test46 kernel: [ 77.500728] ophdrs = 0 (ophdr space = 0 bytes)
> > May 23 10:48:52 test46 kernel: [ 77.500728] ophdr + reg = 0 bytes
> > May 23 10:48:52 test46 kernel: [ 77.500728] num regions = 0
> > May 23 10:48:52 test46 kernel: [ 77.500728]
> >
> > Yes. I can readily reproduce the problem, but it is with my mangled up
> > patchsets :). There is a small change that makes this problem reproduce
> > consistently.
>
> Interesting. That implies that the CIL stole the reservation for the
> checkpoint headers from this reservation, and then it overran by 48
> bytes. An increase in the number of quotas should not affect this.
>
> What is the xfs_info output on the filesystem that is triggering
> this?

I have the same set of patches, but it is not happening any more :(. I
will keep trying.

>
> Cheers,
>
> Dave.