On Monday 12 of March 2012, Dave Chinner wrote:
> On Fri, Mar 09, 2012 at 08:28:47PM +0100, Arkadiusz Miśkiewicz wrote:
> > Are there any bugs in the area visible in the tracebacks below? I have a
> > system where one operation (upgrade of a single rpm package) causes the
> > rpm process to hang in D-state, sysrq-w below:
> >
> > [ 400.755253] SysRq : Show Blocked State
> > [ 400.758507]  task                        PC stack   pid father
> > [ 400.758507] rpm             D 0000000100005781     0  8732   8698 0x00000000
> > [ 400.758507]  ffff88021657dc48 0000000000000086 ffff880200000000 ffff88025126f480
> > [ 400.758507]  ffff880252276630 ffff88021657dfd8 ffff88021657dfd8 ffff88021657dfd8
> > [ 400.758507]  ffff880252074af0 ffff880252276630 ffff88024cb0d005 ffff88021657dcb0
> > [ 400.758507] Call Trace:
> > [ 400.758507]  [<ffffffff8114b22a>] ? kmem_cache_free+0x2a/0x110
> > [ 400.758507]  [<ffffffff8114d2ed>] ? kmem_cache_alloc+0x11d/0x140
> > [ 400.758507]  [<ffffffffa00df3c7>] ? kmem_zone_alloc+0x67/0xe0 [xfs]
> > [ 400.758507]  [<ffffffff8148b78a>] schedule+0x3a/0x50
> > [ 400.758507]  [<ffffffff8148d25d>] rwsem_down_failed_common+0xbd/0x150
> > [ 400.758507]  [<ffffffff8148d303>] rwsem_down_write_failed+0x13/0x20
> > [ 400.758507]  [<ffffffff812652a3>] call_rwsem_down_write_failed+0x13/0x20
> > [ 400.758507]  [<ffffffff8148c8ed>] ? down_write+0x2d/0x40
> > [ 400.758507]  [<ffffffffa00cf97c>] xfs_ilock+0xcc/0x120 [xfs]
> > [ 400.758507]  [<ffffffffa00d4ace>] xfs_setattr_nonsize+0x1ce/0x5b0 [xfs]
> > [ 400.758507]  [<ffffffff81265502>] ? __strncpy_from_user+0x22/0x60
> > [ 400.758507]  [<ffffffffa00d52ab>] xfs_vn_setattr+0x1b/0x40 [xfs]
> > [ 400.758507]  [<ffffffff8117c1a2>] notify_change+0x1a2/0x340
> > [ 400.758507]  [<ffffffff8115ed80>] chown_common+0xd0/0xf0
> > [ 400.758507]  [<ffffffff8115fe4c>] sys_chown+0xac/0x1a0
> > [ 400.758507]  [<ffffffff81495112>] system_call_fastpath+0x16/0x1b
>
> I can't see why we'd get a task stuck here - it's waiting on the
> XFS_ILOCK_EXCL. The only reason for this is if we leaked an unlock
> somewhere. It appears you can reproduce this fairly quickly,

The Linux-VServer patch [1] seems to be messing with locking. It would be
nice if you could take a quick look at it to see whether it can be
considered the guilty party.

On the other hand, I wasn't able to reproduce this on 3.0.22; the vserver
patch for .22 [2] does the same thing as the vserver patch for 3.2.9.

> so
> running an event trace via trace-cmd for all the xfs_ilock trace
> points and posting the report output might tell us what inode is
> blocked and where we leaked (if that is the cause).

I will try to get more information, but it will take some time (most likely
weeks) before I can take this machine down for debugging.

> Cheers,
> Dave.

1. http://vserver.13thfloor.at/Experimental/patch-3.2.9-vs2.3.2.7.diff
2. http://vserver.13thfloor.at/Experimental/patch-3.0.22-vs2.3.2.3.diff

-- 
Arkadiusz Miśkiewicz                         PLD/Linux Team
arekm / maven.pl                             http://ftp.pld-linux.org/
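
For illustration of the failure mode Dave describes, here is a minimal
user-space sketch. It is an assumption-laden analogy, not XFS or
Linux-VServer code: the names are made up and a pthread rwlock stands in
for the kernel rw_semaphore behind XFS_ILOCK_EXCL. A path that takes the
exclusive lock and returns on an error branch without the matching unlock
leaves every later exclusive locker blocked forever, which is the same
shape as the rwsem_down_write_failed() wait under xfs_ilock() in the
trace above.

/*
 * Illustrative sketch only -- not XFS or Linux-VServer code; all names
 * below are made up.  A writer path that returns on an error branch
 * without dropping its exclusive lock "leaks" the unlock, and every
 * later exclusive locker then blocks forever.
 */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t fake_ilock = PTHREAD_RWLOCK_INITIALIZER;

/* Stand-in for a setattr-style path that hits an error branch. */
static int buggy_setattr_path(void)
{
	int check_failed = 1;                   /* pretend some check fails */

	pthread_rwlock_wrlock(&fake_ilock);     /* analogue of taking XFS_ILOCK_EXCL */

	if (check_failed)
		return -1;                      /* BUG: early return, unlock leaked */

	pthread_rwlock_unlock(&fake_ilock);     /* the unlock that never happens above */
	return 0;
}

int main(void)
{
	buggy_setattr_path();

	/*
	 * The "rpm doing chown()" side: the exclusive lock can never be
	 * taken again.  trywrlock reports EBUSY; a blocking acquisition
	 * would simply wait forever, i.e. a task stuck in D state.
	 */
	if (pthread_rwlock_trywrlock(&fake_ilock) == EBUSY)
		printf("exclusive lock leaked: the next locker would hang\n");

	return 0;
}

Compile and link with -pthread. In the real hang, the leaked holder would
be whichever kernel path last took XFS_ILOCK_EXCL on that inode without
releasing it, which is what tracing the xfs_ilock trace points is meant to
pinpoint.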