On Tue, May 13, 2014 at 2:03 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Tue, May 13, 2014 at 12:02:18AM -0700, Austin Schuh wrote:
>> On Mon, May 12, 2014 at 11:39 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > On Mon, May 12, 2014 at 09:03:48PM -0700, Austin Schuh wrote:
>> >> On Mon, May 12, 2014 at 8:46 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> >> > On Mon, May 12, 2014 at 06:29:28PM -0700, Austin Schuh wrote:
>> >> >> On Wed, Mar 5, 2014 at 4:53 PM, Austin Schuh <austin@xxxxxxxxxxxxxxxx> wrote:
>> >> >> > Hi Dave,
>> >> >> >
>> >> >> > On Wed, Mar 5, 2014 at 3:35 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> >> >> >> On Wed, Mar 05, 2014 at 03:08:16PM -0800, Austin Schuh wrote:
>> >> >> >>> Howdy,
>> >> >> >>>
>> >> >> >>> I'm running a config_preempt_rt patched version of the 3.10.11 kernel,
>> >> >> >>> and I'm seeing a couple lockups and crashes which I think are related
>> >> >> >>> to XFS.
>> >> >> >>
>> >> >> >> I think they are more likely related to RT issues....
>> >> >> >>
>> >> >> >
>> >> >> > That very well may be true.
>> >> >> >
>> >> >> >> Cheers,
>> >> >> >>
>> >> >> >> Dave.
>> >> >> >> --
>> >> >> >> Dave Chinner
>> >> >>
>> >> >> I had the issue reproduce itself today with just the main SSD
>> >> >> installed. This was on a new machine that was built this morning.
>> >> >> There is a lot less going on in this trace than the previous one.
>> >> >
>> >> > The three blocked threads:
>> >> >
>> >> >     1. kworker running IO completion waiting on an inode lock,
>> >> >        holding locked pages.
>> >> >     2. kworker running writeback flusher work waiting for a page lock
>> >> >     3. direct flush work waiting for allocation, holding page
>> >> >        locks and the inode lock.
>> >> >
>> >> > What's the kworker thread running the allocation work doing?
>> >> >
>> >> > You might need to run `echo w > /proc/sysrq-trigger` to get this
>> >> > information...
>> >>
>> >> I was able to reproduce the lockup. I ran
>> >> `echo w > /proc/sysrq-trigger` per your suggestion. I don't know how
>> >> to figure out what the kworker thread is doing, but I'll happily do
>> >> it if you can give me some guidance.
>> >
>> > There isn't a worker thread blocked doing an allocation in that
>> > dump, so it doesn't shed any light on the problem at all. Try
>> > `echo l > /proc/sysrq-trigger`, followed by
>> > `echo t > /proc/sysrq-trigger` so we can see all the processes
>> > running on CPUs and all the processes in the system...
>> >
>> > Cheers,
>> >
>> > Dave.
>>
>> Attached is the output of the two commands you asked for.
>
> Nothing there. There's lots of processes waiting for allocation to
> run, and no kworkers running allocation work. This looks more
> like an rt-kernel workqueue issue, not an XFS problem.
>
> FWIW, it would be really helpful if you compiled your kernels with
> frame pointers enabled - the stack traces are much more precise and
> readable (i.e. it gets rid of all the false/stale entries) and that
> helps immensely in understanding where things are stuck.
>
> Cheers,
>
> Dave.

Thanks Dave.

I'll go check with the rt-kernel guys and take it from there. Thanks
for the frame pointers suggestion. I'll make that change the next
time I build a kernel.

Austin

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
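
For reference, the sysrq dumps requested in this thread (`w`, then `l` and `t`) can be triggered from one small helper. This is only a minimal sketch, assuming a root shell on a kernel with magic sysrq support; the function name `sysrq_dump` and the `SYSRQ_TRIGGER` override are hypothetical and exist purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of any kernel tooling.
# SYSRQ_TRIGGER is an illustrative override; the default is the real trigger file.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

# sysrq_dump KEY...
# Writes each sysrq key to the trigger file in turn. Keys used in this thread:
#   w  dump tasks in uninterruptible (blocked) state
#   l  show stack backtraces for all active CPUs
#   t  dump all current tasks and their information
sysrq_dump() {
    for key in "$@"; do
        printf '%s\n' "$key" > "$SYSRQ_TRIGGER" || return 1
    done
}
```

Run as root, `sysrq_dump w l t` followed by `dmesg > sysrq-dump.txt` would save the resulting traces from the kernel log; the output file name is arbitrary.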