Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Dec 18, 2022 at 03:55:53PM +0000, Jonathan Cameron wrote:
> On Sun, 18 Dec 2022 08:25:34 +0800
> johnny <johnny.li@xxxxxxxxxxxxxxxx> wrote:
> 

[snip]

> > >   
> > > > > +	}
> > > > > +
> > > > > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > > > > +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > > > > +		.payload_in = &payload,
> > > > > +		.size_in = pl_size,  
> > > > 
> > > > This payload size should be whatever we need to store the records,
> > > > not the max size possible.  Particularly as that size is currently
> > > > bigger than the mailbox might be.  
> > > 
> > > But the above check and set ensures that does not happen.
> > >   
> > > > 
> > > > It shouldn't fail (I think) simply because a later version of the spec might
> > > > add more to this message and things should still work, but definitely not
> > > > good practice to tell the hardware this is much longer than it actually is.  
> > > 
> > > I don't follow.
> > > 
> > > The full payload is going to be sent even if we are just clearing 1 record
> > > which is inefficient but it should never overflow the hardware because it is
> > > limited by the check above.
> > > 
> > > So why would this be a problem?
> > >   
> > 
> > per spec3.0, Event Record Handles field is "A list of Event Record Handles the 
> > host has consumed and the device shall now remove from its internal Event Log 
> > store.". Extra unused handle list does not folow above description. And also 
> > spec mentions "All event record handles shall be nonzero value. A value of 0 
> > shall be treated by the device as an invalid handle.". So if there is value 0 
> > in extra unused handles, device shall return invalid handle error code
> 
> I don't think we call into that particular corner as the number of event
> record handles is set correctly.  Otherwise I agree this isn't following the
> spec - though I think key here is that it won't be broken against CXL 3.0 devices
> (with that rather roundabout argument that a CXL 3.0 devices should handle later
> spec messages as those should be backwards compatible) but it might be broken
> against CXL 3.0+ ones that interpret the 0s at the end as having meaning.

I'm respining this to add the pci_set_master() anyway.  So I'm going to change
this as well.  I really don't see how hardware would go off anything but the
number of records to process the handles I could see some overly strict
firmware wanting to validate the size being exactly equal to the number
specified rather than just less than (which is what I would anticipate an issue
with).

Dan has agreed to land the movement of the trace point definition to
drivers/cxl patch I need to cxl/next.  After that I will rebase and send out.

Ira

> 
> Thanks,
> 
> Jonathan
> 



[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux