Re: Crash in -next due to 'MIPS: Add cacheinfo support'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 10, 2017 at 11:15:31AM -0800, Florian Fainelli wrote:
> On 02/10/2017 09:46 AM, Guenter Roeck wrote:
> > On Fri, Feb 10, 2017 at 10:39:52AM +0000, James Hogan wrote:
> >> On Fri, Feb 10, 2017 at 09:40:11AM +0000, James Hogan wrote:
> >>> Hi Guenter,
> >>>
> >>> On Thu, Feb 09, 2017 at 08:50:04PM -0800, Guenter Roeck wrote:
> >>>> On 02/09/2017 04:01 PM, Justin Chen wrote:
> >>>>> Hello Guenter,
> >>>>>
> >>>>> I am unable to reproduce the problem. May you give me more details?
> >>>>>
> >>>> The scripts I am using are available at
> >>>>
> >>>> https://github.com/groeck/linux-build-test/tree/master/rootfs
> >>>>
> >>>> in the mips and mipsel subdirectories (both crash). Individual
> >>>> build logs are always available at kerneltests.org/builders,
> >>>> in the 'next' column.
> >>>
> >>> Did you find it 100% reproducible? I was trying to reproduce yesterday
> >>> evening, and did hit it a few times, but then it stopped happening
> >>> before I could try and figure out what was going wrong.
> >>
> >> I switched to gcc 5.2 (buildroot toolchain) for building kernel, and now
> >> it reproduces every time :)
> >>
> > gcc 5.4.0 (binutils 2.26.1) also reliably crashes. The exact crash depends on
> > the kernel (-next is different to ToT + offending patch, qemu command line
> > makes a difference, qemu version makes a difference), but the crash is easy
> > to reproduce, at least for me.
> 
> Just to clarify here, is the crash a combination of:
> 
> - the kernel image, based on different trees/branches

I tried with linux-next, and I tried with ToT with the offending patch
applied. Both fail reliably. Without the patch, both pass reliably.

> - different GCC versions

I can only say that I see it crashes with both gcc 5.2.0 and gcc 5.4.0.

> - different ways of invoking QEMU/QEMU version?
> 
Yes and no. One way of _not_ (or maybe not always) seeing this crash
is to use the bare-bone command line with qemu 2.7 or 2.8 (with no
root file system provided). This crashes because there is no root file
system, but not as described earlier. It does crash, though, when
providing a more comprehensive command line. All kernel versions
without this patch do not crash.

On the other side, blaming this on the qemu command line is akin to
saying that a crash only seen if a mouse is connected would be caused
by the mouse, so I am not entirely sure if that helps too much.
Also see below, regarding heisenbug.

> and essentially Justin's commit just made problem 1) to occur, but is
> not the root cause of the crash you are seeing?

That would not necessarily be my conclusion. Of course, the code appears
to be heavily SMP related, so it may well be that it exposes some
problem associated with cache handling or support in non-SMP configurations.

Of course, it might also be possible that there is a qemu problem somewhere
which only manifests itself on non-SMP mips images with Justin's commit
applied. That appears to be somewhat unlikely, though I have no hard data
supporting this guess.

I'll do some more testing and try to find the actual crash location.
Tricky though since it almost looks like there is a not completely
initialized workqueue. Making things worse, the problem "goes away"
if I add some debug log into process_one_work(), meaning there may
be a heisenbug.

Guenter




[Index of Archives]     [Linux MIPS Home]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Linux]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux