On 18.11.2021 13.19, Marek Szyprowski wrote: > Hi, > > On 15.11.2021 23:16, Mathias Nyman wrote: >> xHC hardware can only have one slot in default state with address 0 >> waiting for a unique address at a time, otherwise "undefined behavior >> may occur" according to xhci spec 5.4.3.4 >> >> The address0_mutex exists to prevent this across both xhci roothubs. >> >> If hub_port_init() fails, it may unlock the mutex and exit with a xhci >> slot in default state. If the other xhci roothub calls hub_port_init() >> at this point we end up with two slots in default state. >> >> Make sure the address0_mutex protects the slot default state across >> hub_port_init() retries, until slot is addressed or disabled. >> >> Note, one known minor case is not fixed by this patch. >> If device needs to be reset during resume, but fails all hub_port_init() >> retries in usb_reset_and_verify_device(), then it's possible the slot is >> still left in default state when address0_mutex is unlocked. >> >> Cc: <stable@xxxxxxxxxxxxxxx> >> Signed-off-by: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> > > This patch landed in linux next-20211118 as commit 6ae6dc22d2d1 ("usb: > hub: Fix usb enumeration issue due to address0 race"). On my test > systems it triggers the following deplock warning during system > suspend/resume cycle: > > ====================================================== > WARNING: possible circular locking dependency detected > 5.16.0-rc1-00014-g6ae6dc22d2d1 #4126 Not tainted > ------------------------------------------------------ > kworker/u16:8/738 is trying to acquire lock: > cf81f738 (hcd->address0_mutex){+.+.}-{3:3}, at: > usb_reset_and_verify_device+0xe8/0x3e4 > > but task is already holding lock: > cf80ab3c (&port_dev->status_lock){+.+.}-{3:3}, at: > usb_port_resume+0xa0/0x7e8 > Thanks, I see it now. Lock order is: mutex_lock(&port_dev->status_lock) mutex_lock(hcd->address0_mutex) mutex_unlock(hcd->address0_mutex) mutex_unlock(&port_dev->status_lock) in hub_port_connect(), usb_port_resume() and usb_reset_device() But patch changed the status_lock and address0_mutex lock order in hub_port_connect(). Lets see if we can take the status_lock a bit earlier in hub_port_connect() to solve this. -Mathias