Re: XFS kernel panic bug?

On Thu, Jun 12, 2014 at 06:28:13AM +0100, Justin Clift wrote:
> Hi Niels,
> 
> 4 out of 5 Rackspace slave VMs hung overnight.  Rebooted one of
> them and it didn't come back.  Checking out its console (they
> have an in-browser Java applet for it) showed a kernel traceback.
> 
> Scrollback for the console (took some effort ;>) is attached.
> 
> It's showing a bunch of XFS messages from the system shutdown,
> but then (down a page or two) a traceback starts.
> 
> Does this look like a potential XFS bug, or does it hint that
> something else is wrong?  e.g. an improper cleanup script, or
> kernel settings that need adjusting, or ?

This does not look like a kernel panic, but like a case where XFS fails
to allocate space in time, so outstanding data cannot be written to the
filesystem.

The traces list tasks such as "flush-7:2:15212", which decodes as:
- major-device-id=7 (loop)
- minor-device-id=2 (/dev/loop2)
- pid=15212
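
To decode these labels in bulk, something like the sketch below could
help. It assumes the label always has the "flush-MAJOR:MINOR:PID" shape
seen in this console log; the names FlushTask, parse_flush_task and
device_path are made up for illustration and are not part of any tool.

```python
import re
from typing import NamedTuple, Optional


class FlushTask(NamedTuple):
    major: int
    minor: int
    pid: int


# On Linux, block major 7 is the loop driver and minor N is /dev/loopN.
LOOP_MAJOR = 7


def parse_flush_task(label: str) -> FlushTask:
    """Split a 'flush-MAJOR:MINOR:PID' label into its three numbers."""
    match = re.fullmatch(r"flush-(\d+):(\d+):(\d+)", label)
    if match is None:
        raise ValueError(f"not a flush task label: {label!r}")
    return FlushTask(*(int(group) for group in match.groups()))


def device_path(task: FlushTask) -> Optional[str]:
    """Return the /dev path for loop devices, None for other majors."""
    if task.major == LOOP_MAJOR:
        return f"/dev/loop{task.minor}"
    return None


if __name__ == "__main__":
    task = parse_flush_task("flush-7:2:15212")
    print(task, device_path(task))
```

Only the loop major is mapped here; other majors would need a lookup
in /proc/devices or similar.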

The mountpoint /d cannot be unmounted, likely because /d contains the
files that are set up to back /dev/loop*. When writing the data to
/dev/loop* fails, /d stays busy and the killall service on shutdown
is not able to kill the running processes (they are in D-state
because they are waiting for the writes to finish).
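
If a hung VM is still reachable, the D-state tasks can be listed before
rebooting. A minimal sketch, assuming the usual /proc/<pid>/stat layout
of "pid (comm) state ..."; the function names and the proc_root
parameter are my own, added so the code can be pointed at a test
directory.

```python
import os
from typing import List, Tuple


def parse_stat_state(stat_line: str) -> Tuple[int, str, str]:
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm may itself contain spaces or parentheses, so use the last ')'.
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen].strip())
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """List (pid, comm) for every process currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_state(handle.read())
        except OSError:
            # The process exited between listdir() and open().
            continue
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

This only looks at one stat file per process, not at the individual
threads under /proc/<pid>/task.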

My guess would be that there is a deadlock somewhere, not necessarily
a bug in XFS, but in any of the components that are mentioned in the
backtraces. It could also be a circular dependency: the system tries to
allocate memory, which triggers flushing of dirty data to the loop
devices, which itself needs memory to finish, which brings us back to
the allocation that started it.

An orderly cleanup before the shutdown may make rebooting more
reliable. Alternatively, make sure that the system has enough memory so
that writing to loop devices never fails, or don't use loop devices at
all and use "real" disks.
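
For the cleanup step, it helps to know which loop devices are backed by
files under the mountpoint. A rough sketch, assuming the kernel exposes
the sysfs attribute /sys/block/loopN/loop/backing_file (older kernels
may not have it; "losetup -a" shows the same information). The function
names and the sys_block parameter are hypothetical.

```python
import os
from typing import Dict, List


def loop_backing_files(sys_block: str = "/sys/block") -> Dict[str, str]:
    """Map /dev/loopN to its backing file for every attached loop device."""
    backing = {}
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("loop"):
            continue
        attr = os.path.join(sys_block, name, "loop", "backing_file")
        try:
            with open(attr) as handle:
                backing["/dev/" + name] = handle.read().strip()
        except OSError:
            # Unattached loop devices have no backing_file attribute.
            continue
    return backing


def loops_backed_under(mountpoint: str, backing: Dict[str, str]) -> List[str]:
    """Return the loop devices whose backing file lives below mountpoint."""
    prefix = mountpoint.rstrip("/") + "/"
    return sorted(dev for dev, path in backing.items() if path.startswith(prefix))


if __name__ == "__main__":
    # These devices would have to be detached before /d can be unmounted.
    print(loops_backed_under("/d", loop_backing_files()))
```

Detaching those devices (after unmounting whatever is mounted from
them) should leave /d free to unmount.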

If you capture a vmcore (needs kdump installed and configured), we may 
be able to see the cause more clearly.

HTH,
Niels
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel



