On Thursday 01 September 2016 10:32 AM, Viresh Kumar wrote:
On 31-08-16, 12:46, Alan Stern wrote:
On Wed, 31 Aug 2016, Viresh Kumar wrote:
On 05-08-16, 11:51, Alan Stern wrote:
+++ usb-4.x/drivers/usb/core/hub.c
@@ -1052,7 +1052,7 @@ static void hub_activate(struct usb_hub
/* Continue a partial initialization */
if (type == HUB_INIT2 || type == HUB_INIT3) {
- device_lock(hub->intfdev);
+ device_lock(&hdev->dev);
Hi Alan,
I have received reports of kernel crashes (NULL dereference) due to this patch
in some of the corner cases. Note that we have backported this patch (and few
other) to 3.10 kernel. I have attached my hub.c file as well for reference.
Here is the reported kernel OOPs:
[ 19.476228] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 19.476231] pgd = ffffffc00007d000
[ 19.476236] [00000000] *pgd=000000000e90b003, *pmd=000000000e90c003, *pte=00e00000f9000407
[ 19.476242] Internal error: Oops: 96000045 [#1] PREEMPT SMP
[ 19.476273] Modules linked in: gb_vibrator(O) gb_usb(O) gb_uart(O) gb_spi(O) gb_sdio(O) gb_raw(O) gb_pwm(O) gb_power_supply(O) gb)
[ 19.476279] CPU: 0 PID: 344 Comm: kworker/0:3 Tainted: G O 3.10.97-g4b7224f-dirty #454
[ 19.476290] Workqueue: events hub_init_func2
[ 19.476293] task: ffffffc09b3560c0 ti: ffffffc09ada8000 task.ti: ffffffc09ada8000
[ 19.476300] PC is at __mutex_lock_slowpath+0x138/0x224
[ 19.476303] LR is at __mutex_lock_slowpath+0x128/0x224
...
[ 19.476582] Call trace:
[ 19.476586] [<ffffffc000ccf13c>] __mutex_lock_slowpath+0x138/0x224
[ 19.476590] [<ffffffc000ccf254>] mutex_lock+0x2c/0x48
[ 19.476593] [<ffffffc0006f6eac>] hub_activate+0x50/0x4d8
[ 19.476596] [<ffffffc0006f7388>] hub_init_func2+0x14/0x1c
[ 19.476602] [<ffffffc0002387ac>] process_one_work+0x26c/0x3cc
[ 19.476605] [<ffffffc000239988>] worker_thread+0x208/0x358
[ 19.476610] [<ffffffc00023f360>] kthread+0xbc/0xc4
...
This happens when the device is infinitely generating connected and then removed
(not manually, but due to some hardware issues).
If I'm reading this right, it means that hub->hdev is NULL in
I am not sure I am in sync here :(
hub_activate().
We would have gotten the crash right from hub_activate() in that case, isn't it?
The fact that the call sequence reached mutex_lock() here, it means that
hub->hdev->dev was valid at least. The mutex dev->mutex is somewhat corrupted or
uninitialized, etc.. And that's where it all went wrong. As
__mutex_lock_slowpath() is called, it means that the mutex had a count of 0
instead of 1 during the lock and then we crashed during __mutex_lock_slowpath(),
which can only happen if the mutex is uninitialized in the first place.
The mutex gets initialized as part of device_add() and so things can go wrong if
device_add() was skipped here somehow.
I may be completely wrong, but that's what I read :)
I don't see how that could be true; it is initialized
to a non-NULL value and then never changed until the hub structure is
deallocated.
Is it possible to add some debugging printk's in there to find out
what's happening?
I will ask Vaibhav (cc'd) to do it, not sure if he is around today or not.
Thanks for your quick reply.
I have some more update on this,
It seems the culprit was my laptop USB port (I have to say bad port),
which resulted into continuous connect/disconnect event, on both
laptop and phone (Android), which eventually resulted into kernel panic.
It could be a memory corruption or race somewhere (not sure though),
which is where device is not initialized properly when execution reached
to hub_activate().
I connected phone to another USB port and things started working as
expected.
I can put prints to trace the execution flow, lets see what comes out
from it.
--
Thanks,
Vaibhav
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html