Hello Zheng,

On 7/7/23 12:24, Zheng Wang wrote:
> In mtk_jpeg_probe, &jpeg->job_timeout_work is bound with
> mtk_jpeg_job_timeout_work. Then mtk_jpeg_dec_device_run
> and mtk_jpeg_enc_device_run may be called to start the
> work.
> If we remove the module which will call mtk_jpeg_remove
> to make cleanup, there may be a unfinished work. The
> possible sequence is as follows, which will cause a
> typical UAF bug.
>
> Fix it by canceling the work before cleanup in the mtk_jpeg_remove
>
> CPU0                  CPU1
>
>                     |mtk_jpeg_job_timeout_work
> mtk_jpeg_remove     |
>   v4l2_m2m_release  |
>     kfree(m2m_dev); |
>                     |
>                     | v4l2_m2m_get_curr_priv
>                     |   m2m_dev->curr_ctx //use
> Fixes: b2f0d2724ba4 ("[media] vcodec: mediatek: Add Mediatek JPEG Decoder Driver")
> Signed-off-by: Zheng Wang <zyytlz.wz@xxxxxxx>
> ---
> - v2: use cancel_delayed_work_sync instead of cancel_delayed_work suggested by Kyrie.
> ---
>  drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> index 0051f372a66c..6069ecf420b0 100644
> --- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> +++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> @@ -1816,6 +1816,7 @@ static void mtk_jpeg_remove(struct platform_device *pdev)
>  {
>  	struct mtk_jpeg_dev *jpeg = platform_get_drvdata(pdev);
>
> +	cancel_delayed_work_sync(&jpeg->job_timeout_work);
>  	pm_runtime_disable(&pdev->dev);
>  	video_unregister_device(jpeg->vdev);
>  	v4l2_m2m_release(jpeg->m2m_dev);

AFAICS, there is a fundamental problem here. The job_timeout_work uses
v4l2_m2m_get_curr_priv(), and by the time the driver module is unloaded,
all the v4l contexts must be closed and released. Hence
v4l2_m2m_get_curr_priv() will return NULL and crash the kernel when the
work is executed before cancel_delayed_work_sync().

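To make the NULL case concrete, here is a small userspace model. It is only a sketch and not the driver or v4l2 code: every mock_* name is hypothetical, and the only thing it mirrors is that there is no current context left once all contexts are released.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the v4l2 mem2mem structures (not kernel code). */
struct mock_ctx {
	void *priv;
};

struct mock_m2m_dev {
	struct mock_ctx *curr_ctx;	/* NULL once all contexts are released */
};

/* Models the lookup: returns NULL when there is no current context. */
static void *mock_get_curr_priv(struct mock_m2m_dev *m2m_dev)
{
	if (!m2m_dev->curr_ctx)
		return NULL;
	return m2m_dev->curr_ctx->priv;
}

/*
 * Returns 1 if a timeout handler that dereferences the result of the
 * lookup without checking it would be safe to run, 0 if it would
 * dereference NULL.
 */
static int mock_timeout_work_is_safe(struct mock_m2m_dev *m2m_dev)
{
	return mock_get_curr_priv(m2m_dev) != NULL;
}
```

With curr_ctx set, the lookup returns the context's private data; with curr_ctx cleared, it returns NULL, and that is the state a late timeout work would see in this model.
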
At the time when mtk_jpeg_remove() is invoked, there should be no
job_timeout_work running in the background, because all jobs must be
completed before the context is released. If you look at
v4l2_m2m_cancel_job(), you can see that it waits for the job to complete
before the context is closed.

You shouldn't be able to remove the driver module while it has
active/open v4l contexts. If you can do that, then this is your bug that
needs to be fixed.

In addition to all this, the job_timeout_work is initialized only for
the single-core JPEG device. I'd expect this patch to crash multi-core
JPEG devices.

-- 
Best regards,
Dmitry