Re: [PATCH RFC v2 3/3] cxl/memdev: Register for and process CPER events

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Smita Koralahalli wrote:
[..]
> > +#define CXL_EVENT_HDR_FLAGS_REC_SEVERITY GENMASK(1, 0)
> > +int cxl_cper_event_call(struct notifier_block *nb, unsigned long action, void *data)
> > +{
> > +	struct cxl_cper_notifier_data *nd = data;
> > +	struct cxl_event_record_raw record = (struct cxl_event_record_raw) {
> > +		.hdr.id = UUID_INIT(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
> > +	};
> > +	enum cxl_event_log_type log_type;
> > +	struct cxl_memdev_state *mds;
> > +	u32 hdr_flags;
> > +
> > +	mds = container_of(nb, struct cxl_memdev_state, cxl_cper_nb);
> > +
> > +	/* Need serial number for device identification */
> > +	if (!(nd->rec->hdr.validation_bits & CPER_CXL_DEVICE_SN_VALID))
> > +		return NOTIFY_DONE;
> 
> For all the event records that I tested so far, this has never been 
> true. That is CPER_CXL_DEVICE_SN_VALID is never set which might not log 
> the records at all. Should we be bit more lenient here and include 
> validating device_id (bdf) instead and check if cxlds exist?

Agree. While I do think those devices are out of spec given CXL mandates
a valid serial number, I think the robustness priciple applies and Linux
should rely on bdf information. I also expect that with MH-SLDs and
potentially other scenarios, a serial number may be duplicated so bdf is
more reliable in that dimension as well.

> > +	/* FIXME endianess and bytes of serial number need verification */
> > +	/* FIXME Should other values be checked? */
> > +	if (memcmp(&mds->cxlds.serial, &nd->rec->hdr.dev_serial_num,
> > +		   sizeof(mds->cxlds.serial)))
> > +		return NOTIFY_DONE;
> > +
> > +	/* ensure record can always handle the full CPER provided data */
> > +	BUILD_BUG_ON(sizeof(record) <
> > +		(CPER_CXL_COMP_EVENT_LOG_SIZE + sizeof(record.hdr.id)));
> > +
> > +	/*
> > +	 * UEFI v2.10 defines N.2.14 defines the CXL CPER record as not
> > +	 * including the uuid field.
> > +	 */
> > +	memcpy(&record.hdr.length, &nd->rec->comp_event_log,
> > +		CPER_CXL_REC_LEN(nd->rec));
> 
> I'm doubtful this will do the job. I think we should copy into each 
> field of struct cxl_event_record_hdr individually starting from length 
> by pointer arithmetic (which is definitely bad, but I cannot think of a 
> better way to do this) and then do memcpy for data field in struct 
> cxl_event_record_raw..
> 
> Any other suggestions would be helpful as well.
> 
> I can make these changes and validate it on my end if that works..?

It sounds like you have a more readily available real world test
environment for this, so that sounds good to me.



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux