Re: XFS assertion from truncate. (3.10-rc2)

Dave Chinner <david@xxxxxxxxxxxxx> · Wed, 22 May 2013 15:51:47 +1000

On Wed, May 22, 2013 at 01:29:38AM -0400, Dave Jones wrote:
> On Wed, May 22, 2013 at 03:12:43PM +1000, Dave Chinner wrote:
> 
>  > > [   36.339105] XFS (sda2): xfs_setattr_size: mask 0xa068 mismatch on file 0\xffffffb8\xffffffd3-\xffffff88\xffffffff\xffffffff
>  > 
>  > So, still the same strange mask. That just doesn't seem right.
> 
> any idea what I screwed up in the filename printing part ?

Nope.

Right now, I have nothing for you but disappointment....

>  > > [   36.350823] XFS: Assertion failed: 0, file: fs/xfs/xfs_iops.c, line: 730
>  > > [   36.359459] ------------[ cut here ]------------
>  > > [   36.365247] kernel BUG at fs/xfs/xfs_message.c:108!
>  > > [   36.371360] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
>  > > [   36.379091] Modules linked in: xfs libcrc32c snd_hda_codec_realtek snd_hda_codec_hdmi microcode(+) pcspkr snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd soundcore pps_core
>  > > [   36.405431] CPU: 1 PID: 2887 Comm: cc1 Not tainted 3.10.0-rc2+ #4
>  > 
>  > Your compiler is triggering this? That doesn't seem likely...
> 
> yeah, though it seems pretty much anything that writes to that partition will cause it.
> Here's fsx, which died instantly...
> 
> [   34.938367] XFS (sda2): xfs_setattr_size: mask 0x2068 mismatch on file 
> 
> (Note, different mask this time)

Which has ATTR_FORCE set but not ATTR_KILL_SUID or ATTR_KILL_SGID.
And that, AFAICT, is impossible.
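
For anyone following along, here is a rough sketch of decoding these
masks by hand. The ATTR_* bit positions are an assumption, written out
from memory of include/linux/fs.h rather than taken from this thread,
and decode_attr_mask() is a hypothetical helper name, so check the table
against your own tree before relying on it.

```python
# ATTR_* bit positions: ASSUMED from include/linux/fs.h (written from memory,
# not from this thread). Verify against the kernel tree actually being run.
ATTR_FLAGS = [
    ("ATTR_MODE", 1 << 0),
    ("ATTR_UID", 1 << 1),
    ("ATTR_GID", 1 << 2),
    ("ATTR_SIZE", 1 << 3),
    ("ATTR_ATIME", 1 << 4),
    ("ATTR_MTIME", 1 << 5),
    ("ATTR_CTIME", 1 << 6),
    ("ATTR_ATIME_SET", 1 << 7),
    ("ATTR_MTIME_SET", 1 << 8),
    ("ATTR_FORCE", 1 << 9),
    ("ATTR_ATTR_FLAG", 1 << 10),
    ("ATTR_KILL_SUID", 1 << 11),
    ("ATTR_KILL_SGID", 1 << 12),
    ("ATTR_FILE", 1 << 13),
    ("ATTR_KILL_PRIV", 1 << 14),
    ("ATTR_OPEN", 1 << 15),
    ("ATTR_TIMES_SET", 1 << 16),
]


def decode_attr_mask(mask):
    """Return the names of the flags set in mask, lowest bit first.

    Any bits not covered by the table are appended as a single hex string.
    """
    names = [name for name, bit in ATTR_FLAGS if mask & bit]
    known = 0
    for _, bit in ATTR_FLAGS:
        known |= bit
    leftover = mask & ~known
    if leftover:
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    # The two masks reported in the xfs_setattr_size messages in this thread.
    for mask in (0xa068, 0x2068):
        print(f"{mask:#06x}: {' | '.join(decode_attr_mask(mask))}")
```

With the table as assumed here, 0x2068 comes out as ATTR_SIZE |
ATTR_MTIME | ATTR_CTIME | ATTR_FILE, and 0xa068 is the same plus
ATTR_OPEN. That differs from the reading above, so the bit positions
are worth double-checking against the tree before trusting either
interpretation.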

>  > This has come through the open path via handle_truncate(), which
>  > means that ATTR_MTIME|ATTR_CTIME|ATTR_OPEN|ATTR_FILE should also be
>  > set in the mask. They aren't, and that says to me that something
>  > else has been blottoed before XFS trips over this. Memory
>  > corruption?
>  >
>  > Can you print out the entire struct iattr? perhaps even hexdump it?
> 
> About to turn in for the night. If there's a shiny diff in my inbox in the morning,
> I'll try it.

I wouldn't lose sleep over it - I'm stumped at this point. I'll get
a working path print to you, at minimum...

> Tomorrow I'll also try running some older kernels with the same
> options to see if it's something new, or an older bug. This is a
> new machine, so it may be something that's been around for a
> while, and for whatever reason, my other machines don't hit
> this.

Another thing that just occurred to me - what compiler are you
using?  We had a report last week on #xfs that xfsdump was failing
with bad checksums because of link time optimisation (LTO) in
gcc-4.8.0. When they turned that off, everything worked fine. So if
you are using 4.8.0, perhaps trying a different compiler might be a
good idea, too.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
