Re: [PATCH] libata: Whitelist SSDs that are known to properly return zeroes after TRIM

James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> · Wed, 10 Dec 2014 23:34:39 +0300

On Wed, 2014-12-10 at 09:29 -0500, Tejun Heo wrote:
> Hello,
> 
> On Wed, Dec 10, 2014 at 07:09:38AM +0300, James Bottomley wrote:
> > Conversely, drives that return random junk after a trim cause
> > verification failures, so we just elect not to transmit trim down to
> > them from the RAID layer.
> 
> I see.  Thanks for the explanation.  I suppose the current raid
> implementations' metadata handling on partially built devices isn't
> developed enough to simply mark those trimmed blocks as unsynced?

We do have a log, which could be used for RAID-1 for this purpose, but
it doesn't seem to be much used in practice.  It takes extra space which
most admins don't account for.

For RAID-2+ or erasure codes this won't work because a bad block read
corrupts the stripe.  The really subtle failure here is that you trim a
stripe and then partially write it: the RMW you do for parity will
produce an incorrect partial syndrome, because it's based on junk rather
than on the syndrome of the other blocks.  The corruption wouldn't be
detected until you get an actual disk failure (meaning everything will
look fine until that crucial day you need your data protection mechanism
to work).  We could cope with this with an even more subtle logging
mechanism, where we only trim whole stripes, log the trimmed stripes and
insist on full instead of partial writes to them so we get back to a
known syndrome, but that's introducing a lot of subtlety into the
logging code ...
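
To make the partial write failure concrete, here is a rough sketch in
Python.  It is a toy model, not md code: it assumes a single XOR parity
block (RAID-5 style) over three data blocks, and every name in it
(rmw_write, full_write, rebuild, fill, BLOCK) is made up for the
illustration.  The "junk" patterns are fixed bytes standing in for
whatever a non-deterministic drive might return after a trim.

```python
BLOCK = 16  # illustrative block size in bytes (assumption)

def xor(*blocks):
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(BLOCK)
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)

def rmw_write(data, parity, idx, new):
    """Partial write: parity is updated from the old block and old parity."""
    new_parity = xor(parity, data[idx], new)
    data = list(data)
    data[idx] = new
    return data, new_parity

def full_write(new_data):
    """Full-stripe write: parity is recomputed from the new data alone."""
    return list(new_data), xor(*new_data)

def rebuild(data, parity, lost):
    """Reconstruct block `lost` from the surviving blocks plus parity."""
    survivors = [blk for i, blk in enumerate(data) if i != lost]
    return xor(parity, *survivors)

def fill(value):
    return bytes([value]) * BLOCK

new = fill(0x11)

# Case 1: drives return zeroes after TRIM, so the trimmed stripe is
# still self-consistent and a partial write keeps parity valid.
d, p = rmw_write([fill(0)] * 3, fill(0), 0, new)
zero_ok = rebuild(d, p, 0) == new

# Case 2: drives return junk after TRIM (fixed patterns stand in for it).
# The RMW folds junk into parity; nothing complains until a rebuild.
d, p = rmw_write([fill(0xA5), fill(0x5A), fill(0x3C)], fill(0x7E), 0, new)
junk_ok = rebuild(d, p, 0) == new

# Case 3: same junk stripe, but we insist on a full-stripe write,
# which gets us back to a known parity.
d, p = full_write([new, fill(0x22), fill(0x33)])
full_ok = rebuild(d, p, 0) == new

print(zero_ok, junk_ok, full_ok)  # True False True
```

In the junk case the write itself succeeds and reads of the new block
return the right data; only the rebuild after losing that disk comes
back wrong.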

James
