Re: XFS crashing system with general protection fault

Hi Dave,

On Tue, 10 Feb 2015 08:24:20 +1100 Dave Chinner wrote:
> On Mon, Feb 09, 2015 at 09:47:01AM +0100, Bruno Prémont wrote:
> > On Fri, 6 Feb 2015 09:15:16 +1100 Dave Chinner wrote:
> > > On Thu, Feb 05, 2015 at 03:10:07PM +0100, Bruno Prémont wrote:
> > > > New crash, new trace, this time on 3.18.2.
> > > > It looks like this time a NULL dereference happened before the overwritten memory poison was detected.
> > > > 
> > > > Once again it's during normal system operation (no mount/umount activity)
> > > 
> > > Can you rebuild the kernel with CONFIG_XFS_WARN=y and see if that
> > > throws any interesting messages into logs?
> > 
> > Will try and see
> > 
> > > However:
> > > 
> > > > [1900390.261491] =============================================================================
> > > > [1900390.272989] BUG task_struct (Tainted: G      D W     ): Poison overwritten
> > > > [1900390.283021] -----------------------------------------------------------------------------
> > > > [1900390.283021] 
> > > > [1900390.297056] INFO: 0xffff880213d651b3-0xffff880213d651b3. First byte 0x6d instead of 0x6b
> > > > [1900390.309044] INFO: Slab 0xffffea00084f5800 objects=16 used=16 fp=0x          (null) flags=0x8000000000004080
> > > > [1900390.323087] INFO: Object 0xffff880213d64ba0 @offset=19360 fp=0xffff880213d61e40
> > > > [1900390.323087] 
> > > > [1900390.336988] Bytes b4 ffff880213d64b90: 60 2d d6 13 02 88 ff ff 5a 5a 5a 5a 5a 5a 5a 5a  `-......ZZZZZZZZ
> > > > [1900390.350988] Object ffff880213d64ba0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> > > > [1900390.364943] Object ffff880213d64bb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> > > ....
> > > > [1900391.674636] Object ffff880213d651b0: 6b 6b 6b 6d 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkmkkkkkkkkkkkk
> > >                                                      ^^
> > > 
> > > There's a single bit that has been flipped in the task_struct slab.
> > > So more than just XFS is seeing memory corruption - this is in core
> > > kernel structure slab caches. I'm not sure, either, how XFS could
> > > cause corruption in this slab.
> > > 
> > > So, I'd be checking all the previous memory corruptions to see if
> > > they are single bit errors, and if there is any pattern to the
> > > addresses at which they occur. The above bit flip makes me think
> > > "hardware issue" and everything else stems from that...
> > 
> > System has ECC RAM so faulty RAM looks less probable (no complaint seen
> > by kernel nor recorded by firmware).
> 
> Sure, but that's not the only hardware in the memory path so single
> bit errors can occur elsewhere as data is moved across the bus or sits
> in cpu caches, and if you're not using an IOMMU then it could even
> be hardware writing to memory incorrectly...
> 
> > All previous crashes for which I have some logs were dereferences after
> > free, not attempts to allocate memory from free slabs with modified
> > poison.
> > 
> > Though what does that single bit represent in that area if it was
> > used/modified after free?
> 
> It means that there's either a use after free, or you have a
> hardware problem. Being in the task struct slab, if it's a use after
> free then it's unlikely to be an XFS problem.

What I meant is: which field of task_struct does the affected byte/bit
belong to? Knowing that would help tell whether this could be a
write-after-free (of a task_struct) or not.
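
To narrow this down, the byte offset inside the object and the bits that
differ from the poison value can be worked out directly from the SLUB
report. A minimal sketch, using only the addresses and byte values quoted
above; the helper name describe_corruption is made up for illustration:

```python
# Values copied from the SLUB "Poison overwritten" report quoted above.
POISON_FREE = 0x6B                    # fill byte SLUB writes into freed objects
OBJECT_START = 0xFFFF880213D64BA0     # "INFO: Object 0x... @offset=19360"
BAD_BYTE_ADDR = 0xFFFF880213D651B3    # "INFO: 0x...-0x... First byte 0x6d"
FOUND_BYTE = 0x6D                     # value found instead of the poison


def describe_corruption(obj_start, bad_addr, expected, actual):
    """Return (offset into object, XOR of expected/actual, differing bit numbers).

    Hypothetical helper, not part of any kernel tooling.
    """
    offset = bad_addr - obj_start
    diff = expected ^ actual
    # Bit 0 is the least significant bit of the byte.
    bits = [bit for bit in range(8) if diff & (1 << bit)]
    return offset, diff, bits


offset, diff, bits = describe_corruption(
    OBJECT_START, BAD_BYTE_ADDR, POISON_FREE, FOUND_BYTE
)
print(f"offset into object: {offset:#x} ({offset} bytes)")
print(f"expected {POISON_FREE:#04x}, found {FOUND_BYTE:#04x}, xor {diff:#04x}")
print(f"differing bits: {bits}")
```

For the numbers in this report that gives offset 0x613 (1555 bytes into
the object) and an XOR of 0x06, i.e. bits 1 and 2 differ (0x6b + 2 = 0x6d).
Mapping offset 0x613 to a field needs the struct layout of this exact
build, for example from "pahole -C task_struct vmlinux", assuming a vmlinux
with debug info is available.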

> FWIW, can you post the output of "grep PARAVIRT <kernel config
> file>"?

grep does not find any match (full config, from before enabling XFS_WARN,
attached).

Cheers,
Bruno

Attachment: xfs.config
Description: Binary data
