On Thu, Jul 15, 2021 at 09:57:22AM -0700, Bhaumik Bhatt wrote: > On 2021-07-15 02:45 AM, Jia-Ju Bai wrote: > > Hello, > > > > I find there is a possible ABBA deadlock in the MHI driver in Linux > > 5.10: > > > > In mhi_pm_m0_transition(): > > 262: read_lock_bh(&mhi_cntrl->pm_lock); > > 281: spin_lock_irq(&mhi_cmd->lock); > > > > In mhi_send_cmd(): > > 1181: spin_lock_bh(&mhi_cmd->lock); > > 1207: read_lock_bh(&mhi_cntrl->pm_lock); > > > > When mhi_pm_m0_transition() and mhi_send_cmd() are concurrently > > executed, the deadlock can occur. > > > > I check the code and find a possible case of such concurrent execution: > > > > #CPU1: > > mhi_poll (mhi_event->process_event(...)) > > mhi_process_ctrl_ev_ring > > mhi_pm_m0_transition > > > > #CPU2: > > mhi_prepare_for_transfer > > mhi_prepare_channel > > mhi_send_cmd > > > > Note that mhi_poll() and mhi_prepare_for_transfer() are both exported > > by EXPORT_SYMBOL. > > Thus, I guess these two functions could be concurrently called by a MHI > > driver. > > > > I am not quite sure whether this possible deadlock is real and how to > > fix it if it is real. > > Any feedback would be appreciated, thanks :) > > > > > > Best wishes, > > Jia-Ju Bai > > Few pointers from your example: > > 1. mhi_poll() is currently not used by any client upstream yet. Then this shouldn't be added in first place... :/ > 2. Polling is not to be used for single event ring (shared control + data) > cases > since it is meant to be for client drivers with dedicated data packets only. > 3. mhi_send_cmd() will always be called after an mhi_pm_m0_transition() has > completed by design since we wait for the device to be held in M0 prior to > it. > But client can be unloaded during M0 event! Anyway, I don't think the deadlock scenario is valid because of the usage of "read_lock_bh()". So if "mhi_send_cmd()" has acquired "spin_lock_bh(&mhi_cmd->lock)", it can always acquire "read_lock_bh(&mhi_cntrl->pm_lock)" as multiple readers can acquire the read lock. Deadlock would only occur if one of the functions take write lock. Thanks for auditing. Regards, Mani > Would like to know what Mani and Hemant have to say. I don't think we can > run in > to the scenario from your example so we should be safe. > > Thanks, > Bhaumik > --- > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project