On Wed, Jul 13, 2022 at 09:14:42AM +0000, Chaitanya Kulkarni wrote:
> On 7/6/22 10:42, Matthew Wilcox wrote:
> > On Thu, Jun 30, 2022 at 02:14:00AM -0700, Chaitanya Kulkarni wrote:
> >> This adds support for the REQ_OP_VERIFY. In this version we add
> >
> > IMO, VERIFY is a useless command. The history of storage is full of
> > devices which simply lie. Since there's no way for the host to check if
> > the device did any work, cheap devices may simply implement it as a NOOP.
>
> Thanks for sharing your feedback regarding cheap devices.
>
> This falls outside of the scope of the work, as scope of this work is
> not to analyze different vendor implementations of the verify command.

The work is pointless. As a customer, I can't ever use the VERIFY
command because I have no reason for trusting the outcome. And there's
no way for a vendor to convince me that I should trust the result.

> > Even expensive devices where there's an ironclad legal contract between
> > the vendor and customer may have bugs that result in only some of the
> > bytes being VERIFYed. We shouldn't support it.
> This is not true with enterprise SSDs, I've been involved with product
> qualification of the high end enterprise SSDs since 2012 including good
> old non-nvme devices with e.g. skd driver on linux/windows/vmware.

Oh, I'm sure there's good faith at the high end. But bugs happen in
firmware, and everybody knows it.

> > Now, everything you say about its value (not consuming bus bandwidth)
> > is true, but the device should provide the host with proof-of-work.
>
> Yes that seems to be missing but it is not a blocker in this work since
> protocol needs to provide this information.

There's no point in providing access to a feature when that feature is
not useful.

> We can update the respective specification to add a log page which
> shows proof of work for verify command e.g.
> A log page consist of the information such as :-
>
> 1. How many LBAs were verified ? How long it took.
> 2. What kind of errors were detected ?
> 3. How many blocks were moved to safe location ?
> 4. How much data (LBAs) been moved successfully ?
> 5. How much data we lost permanently with uncorrectible errors?
> 6. What is the impact on the overall size of the storage, in
>    case of flash reduction in the over provisioning due to
>    uncorrectible errors.

That's not proof of work. That's claim of work.

> > I'd suggest calculating some kind of checksum, even something like a
> > SHA-1 of the contents would be worth having. It doesn't need to be
> > crypto-secure; just something the host can verify the device didn't spoof.
>
> I did not understand exactly what you mean here.

The firmware needs to prove to me that it *did something*. That it
actually read those bytes that it claims to have verified. The simplest
way to do so is to calculate a hash over the blocks which were read
(maybe the host needs to provide a nonce as part of the VERIFY command
so the drive can't "remember" the checksum).