On 29.07.2016 17:41, Alex Damian wrote:
On Fri, Jul 29, 2016 at 2:53 PM, Greg KH <greg@xxxxxxxxx> wrote:
On Fri, Jul 29, 2016 at 10:58:03AM +0100, Alex Damian wrote:
Hi Greg,
I managed to reproduce it with an untainted kernel; see the dmesg paste below.
The stack seemed corrupted as well?
I referred to it as a crash since, after a couple of these issues, the
machine hard freezes. I set up a serial console via a USB cable, but
I don't get the kernel oops out of the machine. The network is also
dead before getting any data. I could not think of any other way to
get a console out of a Macbook - any ideas?
There is a progressive level of deterioration going on below, which is
why I'm adding multiple pastes. See the obviously invalid pointer
0000000000000001 in the 3rd paste below. Also, see the protection fault in
the last paste. To me, something is trampling all over memory, and it
is usb-related.
Not good, thanks for reproducing it without the closed kernel drivers.
If you disable the list debug kernel option, do you have any problems
with the machine? We aren't having any other reports of issues like
this at the moment, which makes me worry that it's something unique to
your situation/hardware.
I strongly suspect it's related to the macbook 12,1 hardware. I haven't
been able to reproduce this with other machines, including other macbook
versions with the same peripherals.
This machine has never been stable in this particular peripheral
configuration. I had Apple run all HW diagnostics on the machine, and I ran
the memcheck to verify that the RAM is ok - all results are clean. The
machine is very stable under Mac OSX.
And you don't know that it's a USB problem, only that USB is the one
that is showing the issue. Anyone could be writing over memory.
True. However, it seems particularly related to the USB mouse - that's
how I manage to reproduce the error.
Also, any chance you can use 'git bisect' to track down an offending
commit? I'm assuming that this used to work properly and something
recently caused the issue, correct?
The earliest kernels I've tested are in the 3.3 range. All kernels
before 4.7 just lock up. 4.7 is the first kernel where I have meaningful
dmesg errors before locking up. As such, there is very little that I can
do to bisect :(.
I'm going through the xhci-related issues that came up during my vacation.
There is one command list related issue fixed in 4.8-rc3; any chance you could try it?
Alternatively, just add the following patch to 4.7:
33be126 xhci: always handle "Command Ring Stopped" events
Enabling xhci debug could reveal something.
echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
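For anyone following along, the command above can be wrapped in a small check so that a missing debugfs mount or insufficient permissions is reported instead of failing silently. This is only a sketch: it assumes a kernel built with CONFIG_DYNAMIC_DEBUG and debugfs mounted at /sys/kernel/debug, and the function name and its optional path argument are hypothetical, added purely for illustration.

```shell
#!/bin/sh
# Sketch: enable xhci_hcd dynamic debug messages.
# Assumptions: CONFIG_DYNAMIC_DEBUG=y, debugfs mounted at /sys/kernel/debug
# (if not: mount -t debugfs none /sys/kernel/debug), and root privileges.

# enable_xhci_debug [control-file]
# The optional argument is a hypothetical convenience for pointing the
# function at a different control file; by default the standard path is used.
enable_xhci_debug() {
    control="${1:-/sys/kernel/debug/dynamic_debug/control}"

    # Report the failure explicitly if the control file cannot be written.
    if [ ! -w "$control" ]; then
        echo "cannot write $control (debugfs mounted? running as root?)" >&2
        return 1
    fi

    # Same request as the echo -n command above: turn on printing (=p)
    # for every debug call site in the xhci_hcd module.
    printf '%s' 'module xhci_hcd =p' > "$control"
}
```

After running enable_xhci_debug as root, reproducing the problem and saving the dmesg output should capture the additional xhci messages.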
-Mathias