Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load

Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> · Sun, 18 Dec 2022 15:55:53 +0000

On Sun, 18 Dec 2022 08:25:34 +0800
johnny <johnny.li@xxxxxxxxxxxxxxxx> wrote:

> On Fri, Dec 16, 2022 at 01:54:01PM -0800, Ira Weiny (ira.weiny@xxxxxxxxx) wrote:
> > On Fri, Dec 16, 2022 at 03:39:39PM +0000, Jonathan Cameron wrote:  
> > > On Sun, 11 Dec 2022 23:06:20 -0800
> > > ira.weiny@xxxxxxxxx wrote:
> > >   
> > > > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> > > > 
> > > > CXL devices have multiple event logs which can be queried for CXL event
> > > > records.  Devices are required to support the storage of at least one
> > > > event record in each event log type.
> > > > 
> > > > Devices track event log overflow by incrementing a counter and tracking
> > > > the time of the first and last overflow event seen.
> > > > 
> > > > Software queries events via the Get Event Record mailbox command; CXL
> > > > rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section
> > > > 8.2.9.2.3 Clear Event Records mailbox command.
> > > > 
> > > > If the result of negotiating CXL Error Reporting Control is OS control,
> > > > read and clear all event logs on driver load.
> > > > 
> > > > Ensure a clean slate of events by reading and clearing the events on
> > > > driver load.
> > > > 
> > > > The status register is not used because a device may continue to trigger
> > > > events and the only requirement is to empty the log at least once.  This
> > > > allows for the required transition from empty to non-empty for interrupt
> > > > generation.  Handling of interrupts is in a follow on patch.
> > > > 
> > > > The device can return up to 1MB worth of event records per query.
> > > > Allocate a shared large buffer to handle the max number of records based
> > > > on the mailbox payload size.
> > > > 
> > > > This patch traces a raw event record and leaves specific event record
> > > > type tracing to subsequent patches.  Macros are created to aid in
> > > > tracing the common CXL Event header fields.
> > > > 
> > > > Each record is cleared explicitly.  A clear all bit is specified but is
> > > > only valid when the log overflows.
> > > > 
> > > > Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>  
> > > 
> > > A few things noticed inline.  I've tightened the QEMU code to reject the
> > > > case where the input payload claims to be bigger than the mailbox size
> > > and hacked the size down to 256 bytes so it triggers the problem
> > > highlighted below.  
> > 
> > I'm not sure what you did here.
> >   
> > >   
> > > > 
> > > > ---
> > > > Changes from V3:
> > > > 	Dan
> > > > 		Split off _OSC pcie bits
> > > > 			Use existing style for host bridge flag in that
> > > > 			patch
> > > > 		Clean up event processing loop
> > > > 		Use dev_err_ratelimited()
> > > > 		Clean up version change log
> > > > 		Delete 'EVENT LOG OVERFLOW'
> > > > 		Remove cxl_clear_event_logs()
> > > > 		Add comment for native cxl control
> > > > 		Fail driver load on event buf allocation failure
> > > > 		Comment why events are not processed without _OSC flag
> > > > ---
> > > >  drivers/cxl/core/mbox.c  | 136 +++++++++++++++++++++++++++++++++++++++
> > > >  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
> > > >  drivers/cxl/cxl.h        |  12 ++++
> > > >  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
> > > >  drivers/cxl/pci.c        |  40 ++++++++++++
> > > >  5 files changed, 392 insertions(+)
> > > > 
> > > > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > > > index b03fba212799..9fb327370e08 100644
> > > > --- a/drivers/cxl/core/mbox.c
> > > > +++ b/drivers/cxl/core/mbox.c  
> > >   
> > > > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > > > +				  enum cxl_event_log_type log,
> > > > +				  struct cxl_get_event_payload *get_pl)
> > > > +{
> > > > +	struct cxl_mbox_clear_event_payload payload = {
> > > > +		.event_log = log,
> > > > +	};
> > > > +	u16 total = le16_to_cpu(get_pl->record_count);
> > > > +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> > > > +	size_t pl_size = sizeof(payload);
> > > > +	struct cxl_mbox_cmd mbox_cmd;
> > > > +	u16 cnt;
> > > > +	int rc;
> > > > +	int i;
> > > > +
> > > > +	/* Payload size may limit the max handles */
> > > > +	if (pl_size > cxlds->payload_size) {
> > > > +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> > > > +		pl_size = cxlds->payload_size;  
> > 
> > pl_size is only the max size possible if that size was smaller than the size of
> > the record [sizeof(payload) above].
> >   
> > > > +	}
> > > > +
> > > > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > > > +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > > > +		.payload_in = &payload,
> > > > +		.size_in = pl_size,  
> > > 
> > > This payload size should be whatever we need to store the records,
> > > not the max size possible.  Particularly as that size is currently
> > > bigger than the mailbox might be.  
> > 
> > But the above check and set ensures that does not happen.
> >   
> > > 
> > > It shouldn't fail (I think) simply because a later version of the spec might
> > > add more to this message and things should still work, but definitely not
> > > good practice to tell the hardware this is much longer than it actually is.  
> > 
> > I don't follow.
> > 
> > The full payload is going to be sent even if we are just clearing 1 record
> > which is inefficient but it should never overflow the hardware because it is
> > limited by the check above.
> > 
> > So why would this be a problem?
> >   
> 
> Per the CXL 3.0 spec, the Event Record Handles field is "A list of Event Record
> Handles the host has consumed and the device shall now remove from its internal
> Event Log store."  A list padded with extra unused handles does not follow that
> description.  The spec also says "All event record handles shall be nonzero
> value. A value of 0 shall be treated by the device as an invalid handle."  So if
> any of the extra unused handles is 0, the device shall return the invalid handle
> error code.

I don't think we fall into that particular corner, as the number of event
record handles is set correctly.  Otherwise I agree this isn't following the
spec.  I think the key here is that it won't be broken against CXL 3.0 devices
(by the rather roundabout argument that a CXL 3.0 device should handle later
spec messages, as those should be backwards compatible), but it might be broken
against devices built to a later spec that interpret the 0s at the end as
having meaning.
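
To make the sizing point concrete, here is a rough user-space sketch, not the
patch itself.  It assumes the payload layout quoted from cxlmem.h; the struct
and helper names (clear_event_payload, clear_event_size(),
clear_event_max_handles()) are made up for illustration and are not kernel
APIs.  The idea is to set size_in to the header plus only the handles actually
being cleared, and to cap the handles per command by the mailbox payload size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLEAR_EVENT_MAX_HANDLES 0xff

/* Mirrors the layout quoted from the patch; the names are illustrative only. */
struct clear_event_payload {
	uint8_t event_log;
	uint8_t clear_flags;
	uint8_t nr_recs;
	uint8_t reserved[3];
	uint16_t handle[CLEAR_EVENT_MAX_HANDLES];	/* little-endian on the wire */
} __attribute__((packed));

/* The fixed part of the message is 6 bytes; the handles follow it. */
static_assert(offsetof(struct clear_event_payload, handle) == 6,
	      "unexpected header size");

/* Bytes needed to carry exactly nr handles, rather than the full array. */
static inline size_t clear_event_size(unsigned int nr)
{
	return offsetof(struct clear_event_payload, handle) +
	       nr * sizeof(uint16_t);
}

/*
 * Most handles that fit in a mailbox of payload_size bytes, capped at the
 * u8 limit of the nr_recs field.
 */
static inline unsigned int clear_event_max_handles(size_t payload_size)
{
	size_t hdr = offsetof(struct clear_event_payload, handle);
	size_t n;

	if (payload_size <= hdr)
		return 0;

	n = (payload_size - hdr) / sizeof(uint16_t);
	if (n > CLEAR_EVENT_MAX_HANDLES)
		n = CLEAR_EVENT_MAX_HANDLES;

	return (unsigned int)n;
}
```

With a 256 byte mailbox this allows 125 handles per command, and clearing a
single record sends 8 bytes rather than the whole structure.  In the kernel
something like struct_size() on a flexible array would probably be the more
natural way to express the same thing.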

Thanks,

Jonathan

> 
> 
> > > 
> > >   
> > > > +	};
> > > > +
> > > > +	/*
> > > > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > > > +	 * Record can return up to 0xffff records.
> > > > +	 */
> > > > +	i = 0;
> > > > +	for (cnt = 0; cnt < total; cnt++) {
> > > > +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> > > > +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> > > > +			log, le16_to_cpu(payload.handle[i]));
> > > > +
> > > > +		if (i == max_handles) {
> > > > +			payload.nr_recs = i;
> > > > +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > > +			if (rc)
> > > > +				return rc;
> > > > +			i = 0;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	/* Clear what is left if any */
> > > > +	if (i) {
> > > > +		payload.nr_recs = i;
> > > > +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > > +		if (rc)
> > > > +			return rc;
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}  
> > > 
> > > 
> > > ...
> > >   
> > > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > > index ab138004f644..dd9aa3dd738e 100644
> > > > --- a/drivers/cxl/cxlmem.h
> > > > +++ b/drivers/cxl/cxlmem.h  
> > > 
> > > ...
> > >   
> > > > +
> > > > +/*
> > > > + * Clear Event Records input payload
> > > > + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> > > > + */
> > > > +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> > > > +struct cxl_mbox_clear_event_payload {
> > > > +	u8 event_log;		/* enum cxl_event_log_type */
> > > > +	u8 clear_flags;
> > > > +	u8 nr_recs;
> > > > +	u8 reserved[3];
> > > > +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];  
> > > 
> > > Doesn't fit in the smallest possible payload buffer.
> > > > It's 516 bytes long.  Payload buffer might be 256 bytes in total.
> > > (8.2.8.4.3 Mailbox capabilities)
> > > 
> > > Lazy approach, make this smaller and do more loops when clearing.
> > > If we want to optimize this later can expand it to this size.  
> > 
> > I agree but the code already checks for and adjusts this on the fly based on
> > cxlds->payload_size?
> > 
> >  +	/* Payload size may limit the max handles */
> >  +	if (pl_size > cxlds->payload_size) {
> >  +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> >  +		pl_size = cxlds->payload_size;
> >  +	}
> > 
> > Why is this not ok?  [Other than being potentially inefficient.]
> > 
> > Do you have a patch to qemu which causes this?
> > 
> > Ira
> >   
> > > > +} __packed;
> > > > +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> > > > +	(((payload_size) -						\
> > > > +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> > > > +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> > > > +		sizeof(__le16))
> > > > +  
> > > 
> > > ...
> > >   
> >   
>