On Mon, May 11, 2009 at 11:00 AM, Matthew Wilcox <matthew@xxxxxx> wrote:
> On Mon, May 11, 2009 at 10:50:59AM -0400, Theodore Tso wrote:
>> On Mon, May 11, 2009 at 10:29:51AM -0400, Ric Wheeler wrote:
>> > The key is not at the FS layer - this is an issue for people who RAID
>> > these beasts together and want to actually check that the bits are what
>> > they should be (say doing a checksum validity check for a stripe).
>>
>> Good point, yes I can see why they need that.  In that case, the
>> storage device can't just silently truncate a TRIM request; it would
>> have to expose to the OS its alignment requirements.  The risk though
>> is that more they try push this complexity into the OS, the higher
>> the risk that the OS will simply decide not to take advantage of the
>> functionality.  Of course, there is the question why anyone would want
>> to build a software-raid device on top of a thin-provisioned hardware
>> storage unit.  :-)
>
> It's not a problem for people who use Thin Provisioning, it's a problem
> for people who want to run RAID-5 on top of SSDs.  If you have a sector
> whose reads are indeterminate, your parity for that stripe will always
> be wrong.

Thus my understanding is that an entire stripe will either be discarded
or not by the mdraid layer.

If a discard comes along from above that is smaller than a stripe, it
will be tossed by the mdraid layer.  If it is not aligned to the stripe
geometry, the start and end of the discard area will be adjusted to be
stripe-aligned.

And since the mdraid layer is not currently planning to track what has
been discarded over time, when a re-shape comes along it will
effectively un-trim everything and rewrite 100% of the FS.

The same thing will happen if a drive is cloned via dd, as happens
pretty routinely.

Overall, I think Linux will need a mechanism to scan a filesystem and
re-issue all the trim commands in order to get the hardware back in
sync after a major maintenance activity.
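To make the stripe rule concrete, here is a rough sketch of the alignment arithmetic as I understand it: shrink the discard inward to whole stripes, and drop it if no full stripe is left. This is not mdraid code. The function name, the use of sector units, and the 128-sector stripe in the usage note are all made up for illustration.

```python
def align_discard_to_stripes(start, length, stripe_sectors):
    """Shrink a discard range inward to whole stripes.

    Hypothetical helper, not part of any real md/mdraid interface.
    All values are in sectors (an assumption for this sketch).
    Returns (aligned_start, aligned_length), or None if the request
    does not cover at least one complete stripe and must be dropped.
    """
    if stripe_sectors <= 0:
        raise ValueError("stripe_sectors must be positive")
    if start < 0 or length < 0:
        raise ValueError("start and length must be non-negative")

    end = start + length  # exclusive end of the requested range

    # Round the start UP to the next stripe boundary.
    aligned_start = -(-start // stripe_sectors) * stripe_sectors
    # Round the end DOWN to the previous stripe boundary.
    aligned_end = (end // stripe_sectors) * stripe_sectors

    # No complete stripe inside the request: toss the discard.
    if aligned_end <= aligned_start:
        return None
    return aligned_start, aligned_end - aligned_start
```

With a 128-sector stripe, a request for sectors 100 to 399 would shrink to sectors 128 to 383, while a 64-sector request would be dropped. Anything trimmed off the edges stays un-discarded, which is part of why a later re-scan would be needed.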
That scan-and-re-trim mechanism could be either admin-invoked or an
always-on maintenance task.

Personally, I think the best option is a background task (in the
kernel, I assume) that scans the filesystem and issues discards for all
the unused space on a slow but steady basis.  If it takes a week to
make its way around the disk/volume, then it takes a week.  Who really
cares?

Once you assume you have that background task in place, I'm not sure
how important it is to even have the filesystem manage this in real
time as files are deleted.

Greg
--
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com