Ira Weiny wrote:
> Smatch caught that cxl_cper_post_event() is called with a spinlock held
> or preemption disabled.[1] The callback takes the device lock to
> perform address translation and therefore might sleep. The record data
> is released back to BIOS in ghes_clear_estatus() which requires it to
> be copied for use in the workqueue.
>
> Copy the record to a lockless list and schedule a work item to process
> the record outside of atomic context.
>
> [1] https://lore.kernel.org/all/b963c490-2c13-4b79-bbe7-34c6568423c7@moroto.mountain/
>
> Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>
> ---
> Changes in v2:
> - djbw: device_lock() sleeps so we need to call the callback in process context
> - iweiny: create work queue to handle processing the callback
> - Link to v1: https://lore.kernel.org/r/20240202-cxl-cper-smatch-v1-1-7a4103c7f5a0@xxxxxxxxx
> ---
>  drivers/acpi/apei/ghes.c | 44 +++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 41 insertions(+), 3 deletions(-)
[..]
> +static DECLARE_WORK(cxl_cper_work, cxl_cper_work_fn);
> +
>  static void cxl_cper_post_event(enum cxl_event_type event_type,
>  				struct cxl_cper_event_rec *rec)
>  {
> +	struct cxl_cper_work_item *wi;
> +
>  	if (rec->hdr.length <= sizeof(rec->hdr) ||
>  	    rec->hdr.length > sizeof(*rec)) {
>  		pr_err(FW_WARN "CXL CPER Invalid section length (%u)\n",
> @@ -721,9 +752,16 @@ static void cxl_cper_post_event(enum cxl_event_type event_type,
>  		return;
>  	}
>
> -	guard(rwsem_read)(&cxl_cper_rw_sem);
> -	if (cper_callback)
> -		cper_callback(event_type, rec);

Given that a work function can be set atomically there is no need to
create / manage a registration lock. Set a 'struct work_struct' instance
to a CXL-provided routine on cxl_pci module load and restore it to a nop
function + cancel_work_sync() on cxl_pci module exit.

> +	wi = kmalloc(sizeof(*wi), GFP_ATOMIC);

The system is already under distress trying to report an error; it
should not dip into emergency memory reserves to report that error. Use
a kfifo(), similar to how memory_failure_queue() avoids memory
allocation in the error reporting path.
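
For the kfifo suggestion, a rough sketch modeled on memory_failure_queue()
follows. This is only illustrative: the fifo depth and the
cxl_cper_work_data layout are made up for the example, the header length
checks from the patch are elided, and the drain loop is a stand-in for
whatever routine cxl_pci ends up running per the registration comment
above.

/* ghes.c side -- kfifo queueing sketch, no allocation at error time */
#include <linux/cxl-event.h>	/* struct cxl_cper_event_rec, enum cxl_event_type */
#include <linux/kfifo.h>
#include <linux/printk.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

struct cxl_cper_work_data {
	enum cxl_event_type event_type;
	struct cxl_cper_event_rec rec;
};

/* statically sized ring (must be a power of two); depth is illustrative */
#define CXL_CPER_FIFO_DEPTH 32
static DEFINE_KFIFO(cxl_cper_fifo, struct cxl_cper_work_data,
		    CXL_CPER_FIFO_DEPTH);
static DEFINE_SPINLOCK(cxl_cper_fifo_lock);	/* serializes producers */

static void cxl_cper_work_fn(struct work_struct *work)
{
	struct cxl_cper_work_data wd;

	/*
	 * Single consumer; with producers serialized by cxl_cper_fifo_lock
	 * the one-reader/one-writer kfifo guarantee lets this side run
	 * lockless. device_lock() / address translation can sleep here.
	 */
	while (kfifo_get(&cxl_cper_fifo, &wd)) {
		/* hand wd.event_type / &wd.rec to the CXL consumer */
	}
}
static DECLARE_WORK(cxl_cper_work, cxl_cper_work_fn);

static void cxl_cper_post_event(enum cxl_event_type event_type,
				struct cxl_cper_event_rec *rec)
{
	struct cxl_cper_work_data wd;
	unsigned long flags;

	wd.event_type = event_type;
	wd.rec = *rec;	/* copy before ghes_clear_estatus() hands it back */

	spin_lock_irqsave(&cxl_cper_fifo_lock, flags);
	if (!kfifo_put(&cxl_cper_fifo, wd))
		pr_err_ratelimited(FW_WARN "CXL CPER kfifo overflow\n");
	spin_unlock_irqrestore(&cxl_cper_fifo_lock, flags);

	schedule_work(&cxl_cper_work);
}

On overflow the record is dropped with a ratelimited message, which seems
preferable to reaching for GFP_ATOMIC while the platform is already busy
reporting errors.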
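
And for the registration point, independent of the queueing details
above, something along these lines is what I have in mind. Again just a
sketch; the helper names and the nop routine are illustrative, not a
final API proposal:

/* ghes.c side -- lock-free registration of the CXL work function */
#include <linux/compiler.h>
#include <linux/export.h>
#include <linux/workqueue.h>

static void cxl_cper_nop_fn(struct work_struct *work)
{
	/* no CXL consumer loaded; queued events are simply dropped */
}

static DECLARE_WORK(cxl_cper_work, cxl_cper_nop_fn);

/* cxl_pci module load: a single pointer store, no registration lock */
void cxl_cper_register_work_fn(work_func_t fn)
{
	WRITE_ONCE(cxl_cper_work.func, fn);
}
EXPORT_SYMBOL_NS_GPL(cxl_cper_register_work_fn, CXL);

/* cxl_pci module exit: restore the nop and flush any in-flight work */
void cxl_cper_unregister_work_fn(void)
{
	WRITE_ONCE(cxl_cper_work.func, cxl_cper_nop_fn);
	cancel_work_sync(&cxl_cper_work);
}
EXPORT_SYMBOL_NS_GPL(cxl_cper_unregister_work_fn, CXL);

With that, cxl_cper_post_event() only needs schedule_work(&cxl_cper_work),
and module unload is synchronized by cancel_work_sync() rather than an
rwsem taken around every event.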