On 9/16/20 11:49 AM, Mansur Alisha Shaik wrote:
> For core ops we are having only write protect but there
> is no read protect, because of this in multithreading
> and concurrency, one CPU core is reading without wait
> which is causing the NULL pointer dereference crash.
>
> one such scenario is as shown below, where in one CPU
> core, core->ops becoming NULL and in another CPU core
> calling core->ops->session_init().
>
> CPU: core-7:
> Call trace:
>  hfi_session_init+0x180/0x1dc [venus_core]
>  vdec_queue_setup+0x9c/0x364 [venus_dec]
>  vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
>  vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
>  v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
>  v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
>  v4l_reqbufs+0x4c/0x5c
>  __video_do_ioctl+0x2b0/0x39c
>
> CPU: core-0:
> Call trace:
>  venus_shutdown+0x98/0xfc [venus_core]
>  venus_sys_error_handler+0x64/0x148 [venus_core]
>  process_one_work+0x210/0x3d0
>  worker_thread+0x248/0x3f4
>  kthread+0x11c/0x12c
>
> Signed-off-by: Mansur Alisha Shaik <mansur@xxxxxxxxxxxxxx>
> Acked-by: Stanimir Varbanov <stanimir.varbanov@xxxxxxxxxx>
> ---
> Changes in V4:
> - Addressed review comments by Stan in patch series
>   https://lore.kernel.org/patchwork/patch/1303678/
>   and combining this change along with shutdown callback
>   as we are facing race conditions with shutdown callback
>
>  drivers/media/platform/qcom/venus/hfi.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
> index a59022a..58d4c06 100644
> --- a/drivers/media/platform/qcom/venus/hfi.c
> +++ b/drivers/media/platform/qcom/venus/hfi.c
> @@ -195,19 +195,34 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
>  int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>  {
>  	struct venus_core *core = inst->core;
> -	const struct hfi_ops *ops = core->ops;
> +	const struct hfi_ops *ops;
>  	int ret;
>
> +	/*
> +	 * If core shutdown is in progress or if we are in system
> +	 * recovery, return an error as during system error recovery
> +	 * session_init() can't pass successfully
> +	 */
> +	mutex_lock(&core->lock);
> +	if (!core->ops || core->sys_error) {
> +		mutex_unlock(&core->lock);
> +		return -EIO;
> +	}
> +	mutex_unlock(&core->lock);
> +
>  	if (inst->state != INST_UNINIT)
>  		return -EINVAL;
>
>  	inst->hfi_codec = to_codec_type(pixfmt);
>  	reinit_completion(&inst->done);
>
> +	mutex_lock(&core->lock);
> +	ops = core->ops;

This is not needed, because we already check core->ops for NULL with the
mutex held at the beginning of the function. Just keep the ops
initialization as it is in the original code.

>  	ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
>  	if (ret)
>  		return ret;
>
> +	mutex_unlock(&core->lock);
>  	ret = wait_session_msg(inst);
>  	if (ret)
>  		return ret;
>

-- 
regards,
Stan
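
For illustration only, here is a minimal userspace sketch of the control flow
that results when the guard at the top of the function is kept and the ops
initialization is left as in the original code. It is not the driver code:
the structures are cut-down stand-ins, pthread_mutex_t stands in for the
kernel mutex, hfi_session_init_sketch() and fake_session_init() are
hypothetical names, and the session_init() callback takes fewer arguments
than the real one. It only shows the single-threaded return paths.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-ins for the venus types (assumption, not the real layout). */
enum inst_state { INST_UNINIT, INST_INIT };

struct venus_inst;

struct hfi_ops {
	int (*session_init)(struct venus_inst *inst);
};

struct venus_core {
	pthread_mutex_t lock;		/* stand-in for the kernel struct mutex */
	const struct hfi_ops *ops;	/* NULL once shutdown has run */
	bool sys_error;			/* set while system error recovery runs */
};

struct venus_inst {
	struct venus_core *core;
	enum inst_state state;
};

/* Hypothetical callback used only to exercise the success path. */
int fake_session_init(struct venus_inst *inst)
{
	inst->state = INST_INIT;
	return 0;
}

int hfi_session_init_sketch(struct venus_inst *inst)
{
	struct venus_core *core = inst->core;
	const struct hfi_ops *ops = core->ops;	/* as in the original code */

	/* Bail out if shutdown or system error recovery is in progress. */
	pthread_mutex_lock(&core->lock);
	if (!core->ops || core->sys_error) {
		pthread_mutex_unlock(&core->lock);
		return -EIO;
	}
	pthread_mutex_unlock(&core->lock);

	if (inst->state != INST_UNINIT)
		return -EINVAL;

	/* No second lock/unlock pair around the callback. */
	return ops->session_init(inst);
}
```

With a NULL ops pointer or sys_error set, the sketch returns -EIO before the
callback is touched; an instance that is not in INST_UNINIT gets -EINVAL.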