On Thu, Jul 14, 2022 at 01:05:47PM -0700, Ira wrote:
> On Thu, Jul 14, 2022 at 09:27:04AM -0700, Dan Williams wrote:
> > ira.weiny@ wrote:
> > > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> > >
> > > The CDAT read may fail for a number of reasons but mainly it is possible
> > > to get different parts of a valid state. The checksum in the CDAT table
> > > protects against this.
> >
> > I don't know what "different parts of a valid state" means.
>
> This text is stale, but given what I know about how other entities may be
> issuing queries without the kernel's knowledge, I'm not 100% sure that the
> data read back will always be valid.
>
> Regardless, this has already caught a bug in QEMU.
>
> So I'm inclined to leave this check in because the checksum is there and
> can be validated, if only to detect broken hardware.
>
> I can update the commit message to clarify this.

Oh wait, I thought this was the 'is valid' patch. I can remove the retries if
that was all you were concerned about.

Ira

>
> Ira
>
> > The CDAT
> > should not be changing as it is being read unless someone is issuing a
> > set-partition while the DOE operation is happening. Rather than
> > arbitrary retries, block out set-partition while CDAT is being read.
> >
> > You can use {set,clear}_exclusive_cxl_commands() to temporarily lock out
> > set-partition while the CDAT read is happening.
> >
> > ...and since this series is only for enabling