Smita Koralahalli wrote: [..] > > +#define CXL_EVENT_HDR_FLAGS_REC_SEVERITY GENMASK(1, 0) > > +int cxl_cper_event_call(struct notifier_block *nb, unsigned long action, void *data) > > +{ > > + struct cxl_cper_notifier_data *nd = data; > > + struct cxl_event_record_raw record = (struct cxl_event_record_raw) { > > + .hdr.id = UUID_INIT(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), > > + }; > > + enum cxl_event_log_type log_type; > > + struct cxl_memdev_state *mds; > > + u32 hdr_flags; > > + > > + mds = container_of(nb, struct cxl_memdev_state, cxl_cper_nb); > > + > > + /* Need serial number for device identification */ > > + if (!(nd->rec->hdr.validation_bits & CPER_CXL_DEVICE_SN_VALID)) > > + return NOTIFY_DONE; > > For all the event records that I tested so far, this has never been > true. That is CPER_CXL_DEVICE_SN_VALID is never set which might not log > the records at all. Should we be bit more lenient here and include > validating device_id (bdf) instead and check if cxlds exist? Agree. While I do think those devices are out of spec given CXL mandates a valid serial number, I think the robustness priciple applies and Linux should rely on bdf information. I also expect that with MH-SLDs and potentially other scenarios, a serial number may be duplicated so bdf is more reliable in that dimension as well. > > + /* FIXME endianess and bytes of serial number need verification */ > > + /* FIXME Should other values be checked? */ > > + if (memcmp(&mds->cxlds.serial, &nd->rec->hdr.dev_serial_num, > > + sizeof(mds->cxlds.serial))) > > + return NOTIFY_DONE; > > + > > + /* ensure record can always handle the full CPER provided data */ > > + BUILD_BUG_ON(sizeof(record) < > > + (CPER_CXL_COMP_EVENT_LOG_SIZE + sizeof(record.hdr.id))); > > + > > + /* > > + * UEFI v2.10 defines N.2.14 defines the CXL CPER record as not > > + * including the uuid field. > > + */ > > + memcpy(&record.hdr.length, &nd->rec->comp_event_log, > > + CPER_CXL_REC_LEN(nd->rec)); > > I'm doubtful this will do the job. I think we should copy into each > field of struct cxl_event_record_hdr individually starting from length > by pointer arithmetic (which is definitely bad, but I cannot think of a > better way to do this) and then do memcpy for data field in struct > cxl_event_record_raw.. > > Any other suggestions would be helpful as well. > > I can make these changes and validate it on my end if that works..? It sounds like you have a more readily available real world test environment for this, so that sounds good to me.