On 10/24/2024 7:39 AM, Simon Horman wrote:
On Tue, Oct 22, 2024 at 10:35:27AM -0700, Pavan Kumar Linga wrote:
When the platform running the device control plane
is rebooted, the driver detects a reset. It releases
all the resources and waits for the reset to complete. Once the
reset is done, it tries to rebuild the resources. If the
device control plane has not yet started at this point,
the driver times out on the virtchnl message and retries
establishing the mailbox.
In the retry flow, the mailbox is deinitialized but the mailbox
workqueue is still alive and polling for mailbox messages.
This results in access to the released control queue, leading to
a null-ptr-deref. Fix it by unrolling the work queue cancellation
and mailbox deinitialization in the reverse order in which they
were initialized.
Also remove the redundant scheduling of the mailbox task in
idpf_vc_core_init.
I think it might be better to move this last change into a separate patch
targeted at iwl rather than iwl-net. It isn't a fix, right?
If we do not make that change in this patch, cancel_delayed_work_sync()
ends up being called twice when idpf_vc_core_init() fails (at the
err_intr_req label). Calling cancel_delayed_work_sync() twice does not
appear to have any side effects, so I will drop that change from this
patch.
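
For reference, here is a rough userspace sketch of the ordering issue. It
is not the idpf code: every type and function name below is a hypothetical
stand-in, and the "poller" is only a flag that models the delayed work
being queued. It is meant to show why the cancellation has to come before
the mailbox deinit, and that a second cancel is a no-op in this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the idpf driver's real types. */
struct ctlq {
	int pending;
};

struct adapter {
	struct ctlq *mbx;     /* control queue the poller reads from */
	bool poll_active;     /* models the mailbox work being queued */
	struct ctlq storage;  /* backing storage for the mailbox queue */
};

/* Initialization order: mailbox first, then the poller that uses it. */
static void mbx_init(struct adapter *a)
{
	a->storage.pending = 0;
	a->mbx = &a->storage;
}

static void poll_start(struct adapter *a)
{
	assert(a->mbx != NULL); /* the poller depends on the mailbox */
	a->poll_active = true;
}

/* Idempotent in this model: cancelling an idle poller changes nothing. */
static void poll_cancel(struct adapter *a)
{
	a->poll_active = false;
}

static void mbx_deinit(struct adapter *a)
{
	a->mbx = NULL;
}

/*
 * One poll tick. Returns -1 where a real poller would have dereferenced
 * the released queue, 0 otherwise.
 */
static int poll_tick(const struct adapter *a)
{
	if (!a->poll_active)
		return 0;
	if (a->mbx == NULL)
		return -1;
	return 0;
}

/* Wrong order: the mailbox goes away while the poller can still run. */
static int teardown_buggy(struct adapter *a)
{
	int rc;

	mbx_deinit(a);
	rc = poll_tick(a); /* poller fires in the window before cancel */
	poll_cancel(a);
	return rc;
}

/* Reverse of init order; the second cancel models the redundant call. */
static int teardown_fixed(struct adapter *a)
{
	poll_cancel(a);
	poll_cancel(a);
	mbx_deinit(a);
	return poll_tick(a);
}
```

Calling mbx_init() and poll_start() on a zeroed struct adapter and then
teardown_buggy() reports the stale access, while teardown_fixed() on the
same setup does not.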
Thanks,
Pavan
Fixes: 4930fbf419a7 ("idpf: add core init and interrupt request")
Fixes: 34c21fa894a1 ("idpf: implement virtchnl transaction manager")
Cc: stable@xxxxxxxxxxxxxxx # 6.9+
Reviewed-by: Tarun K Singh <tarun.k.singh@xxxxxxxxx>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@xxxxxxxxx>
...