Hi Chenyuan, On 28/12/2023 03:33, Yang, Chenyuan wrote: > Hello, > > > > We encountered 5 different crashes in the cec device by using our generated syscall specification for it, here are the descriptions of these 5 crashes and the related files are attached: > > 1. KASAN: slab-use-after-free Read in cec_queue_msg_fh (Reproducible) > > 2. WARNING: ODEBUG bug in cec_transmit_msg_fh > > 3. WARNING in cec_data_cancel > > 4. INFO: task hung in cec_claim_log_addrs (Reproducible) > > 5. general protection fault in cec_transmit_done_ts > > > > For “KASAN: slab-use-after-free Read in cec_queue_msg_fh”, we attached a syzkaller program to reproduce it. This crash is caused by ` list_add_tail(&entry->list, &fh->msgs);` > (https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-adap.c#L224 <https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-adap.c#L224>), which reads a > variable freed by `kfree(fh);` (https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-api.c#L684 > <https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-api.c#L684>). The reproducible program is a Syzkaller program, which can be executed following this document: > https://github.com/google/syzkaller/blob/master/docs/executing_syzkaller_programs.md <https://github.com/google/syzkaller/blob/master/docs/executing_syzkaller_programs.md>. > > > > For “WARNING: ODEBUG bug in cec_transmit_msg_fh”, unfortunately we failed to reproduce it but we indeed trigger this crash almost every time when we fuzz the cec device only. We attached the report > and log for this bug. It tries freeing an active object by using `kfree(data);` (https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-adap.c#L930 > <https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-adap.c#L930>). > > > > For “WARNING in cec_data_cancel”, it is an internal warning used in cec_data_cancel (https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-adap.c#L365 > <https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-adap.c#L365>), which checks whether the transmit is the current or pending. Unfortunately, we also don't have the > reproducible program for this bug, but we attach the report and log. > > > > For “INFO: task hung in cec_claim_log_addrs”, the kernel hangs when the cec device ` wait_for_completion(&adap->config_completion);` > (https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-adap.c#L1579 <https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-adap.c#L1579>). We have a > reproducible C program for this. > > > > For “general protection fault in cec_transmit_done_ts”, the cec device tries derefencing a non-canonical address 0xdffffc00000000e0: 0000 [#1], which is related to the invocation ` > cec_transmit_attempt_done_ts ` (https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-adap.c#L697 > <https://elixir.bootlin.com/linux/v6.7-rc7/source/drivers/media/cec/core/cec-adap.c#L697>). It seems that the address of cec_adapter is totally wrong. We do not have a reproducible program for this > bug, but the log and report for it are attached. > > > > If you have any questions or require more information, please feel free to contact us. Can you retest with the patch below? I'm fairly certain this will fix issues 1 and 2. I suspect at least some of the others are related to 1 & 2, but since I could never get the reproducers working reliably, I had a hard time determining if there are more bugs or if this patch resolves everything. Your help testing this patch will be appreciated! Regards, Hans Signed-off-by: Hans Verkuil <hverkuil-cisco@xxxxxxxxx> --- drivers/media/cec/core/cec-adap.c | 3 +-- drivers/media/cec/core/cec-api.c | 3 +++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/media/cec/core/cec-adap.c b/drivers/media/cec/core/cec-adap.c index 5741adf09a2e..079c3b142d91 100644 --- a/drivers/media/cec/core/cec-adap.c +++ b/drivers/media/cec/core/cec-adap.c @@ -936,8 +936,7 @@ int cec_transmit_msg_fh(struct cec_adapter *adap, struct cec_msg *msg, */ mutex_unlock(&adap->lock); wait_for_completion_killable(&data->c); - if (!data->completed) - cancel_delayed_work_sync(&data->work); + cancel_delayed_work_sync(&data->work); mutex_lock(&adap->lock); /* Cancel the transmit if it was interrupted */ diff --git a/drivers/media/cec/core/cec-api.c b/drivers/media/cec/core/cec-api.c index 67dc79ef1705..d64bb716f9c6 100644 --- a/drivers/media/cec/core/cec-api.c +++ b/drivers/media/cec/core/cec-api.c @@ -664,6 +664,8 @@ static int cec_release(struct inode *inode, struct file *filp) list_del_init(&data->xfer_list); } mutex_unlock(&adap->lock); + + mutex_lock(&fh->lock); while (!list_empty(&fh->msgs)) { struct cec_msg_entry *entry = list_first_entry(&fh->msgs, struct cec_msg_entry, list); @@ -681,6 +683,7 @@ static int cec_release(struct inode *inode, struct file *filp) kfree(entry); } } + mutex_unlock(&fh->lock); kfree(fh); cec_put_device(devnode); -- 2.42.0