On 19/07/2011 11:29, Lutz Vieweg wrote:
On 07/18/2011 10:18 PM, David Brown wrote:
You don't need to fill an erase block for writing - writes are done
as write blocks (I think 4K is the norm).
You are right on that. Those sectors in a partially used erase block
that have not been written to since the last erase of the whole erase
block can be written to just as well as sectors in completely empty
erase blocks.
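To illustrate the point above, here is a toy model of a single erase block. It is only a sketch: the class name EraseBlock and the page count are made up for illustration, and real SSD firmware is far more involved. It shows that a clean page in a partially used block stays writable, while rewriting a used page requires erasing the whole block.

```python
class EraseBlock:
    """Toy model of one flash erase block made of write pages (hypothetical)."""

    def __init__(self, pages_per_block=64):
        # None marks a page that is still clean since the last erase.
        # 64 pages per block is an assumed figure, not taken from any datasheet.
        self.pages = [None] * pages_per_block

    def write(self, page_index, data):
        # A page can be programmed only once per erase cycle.
        if self.pages[page_index] is not None:
            raise ValueError("page already written; erase the whole block first")
        self.pages[page_index] = data

    def erase(self):
        # Erasing is all-or-nothing: every page in the block becomes clean.
        self.pages = [None] * len(self.pages)


block = EraseBlock(pages_per_block=4)
block.write(0, b"a")  # the block is now partially used
block.write(1, b"b")  # a clean page in a partially used block is still writable
```

Calling block.write(0, ...) again at this point raises ValueError until block.erase() has been called.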
My main point about TRIM being expensive is the effect it has on
the block IO queue, regardless of the implementation in the SSD.
Because of those effects on the block IO queue, the user-space
work-around we implemented to discard unused areas on the SSDs in our
RAID-1 arrays does not discard "one area on all SSDs at a time".
Instead, it first iterates through all unused areas on one SSD, then
iterates through the same list of areas on the second SSD.
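The ordering described above could be sketched roughly as follows. This is not the actual tool, which is not shown in this thread; the function names, device names and area list are assumptions for illustration, and blkdiscard from util-linux stands in for whatever issues the discards. The sketch only builds the commands and runs nothing, and it does not address whether the area list is still valid by the time the second device is processed.

```python
def discard_plan(devices, unused_areas):
    """Return (device, offset, length) steps, finishing one device before the next.

    devices:      member devices of the RAID-1 (hypothetical names below)
    unused_areas: list of (offset_bytes, length_bytes) believed to be unused
    """
    # Outer loop over devices, inner loop over areas: all areas on the first
    # SSD are discarded before the second SSD is touched at all.
    return [(dev, off, length) for dev in devices for off, length in unused_areas]


def to_command(step):
    """Build a blkdiscard command line for one step (built only, not executed)."""
    dev, off, length = step
    return ["blkdiscard", "--offset", str(off), "--length", str(length), dev]


plan = discard_plan(["/dev/sda", "/dev/sdb"], [(0, 4096), (8192, 4096)])
commands = [to_command(step) for step in plan]
```

With this ordering only one SSD is busy with discards at any time, so the other one remains free to serve reads.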
Do you take the arrays off-line during this process, or at least make
them read-only? If not, how do you ensure that the lists are valid?
The effect of this is very much to our liking: While we can see
near-100%-utilization on one SSD at a time during the discards, the
other SSD will happily service the readers, and even the writes that
go to the /dev/md* device are buffered in main memory long enough
that we do not really see a significantly bad impact on the service.
(This might be different, though, if the discards were done during
peak-write-load times of the day.)
I really hope your SSDs return zeros for TRIM'ed blocks
For RAID-1, the only consequence of not doing so is that "data-check"
runs may report a mismatch_cnt greater than 0. It does not
destroy any of your data, and as long as I have two SSDs in a RAID,
both of which give a non-error result when reading a sector, I would
have no indication of "which of the returned sector contents to
prefer", anyway.
(I admit that for health monitoring it is useful to have a meaningful
mismatch_cnt.)
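For the health-monitoring side, a minimal sketch of reading the counter after a check run is given below. It assumes the usual md sysfs layout (/sys/block/<array>/md/mismatch_cnt); the array name md0 and the helper names are only examples.

```python
def mismatch_cnt_path(md_name, sysfs_root="/sys"):
    """Build the sysfs path of the mismatch counter for an md array."""
    return f"{sysfs_root}/block/{md_name}/md/mismatch_cnt"


def parse_mismatch_cnt(text):
    """Parse the counter file contents, which hold a single decimal number."""
    return int(text.strip())


def read_mismatch_cnt(md_name, sysfs_root="/sys"):
    """Read the current mismatch count of an md array, e.g. "md0"."""
    with open(mismatch_cnt_path(md_name, sysfs_root)) as f:
        return parse_mismatch_cnt(f.read())
```

A monitoring script could call read_mismatch_cnt("md0") after each check and raise an alert on a non-zero value. As noted above, on a RAID-1 whose members do not return identical data for discarded areas, a non-zero value does not by itself indicate a fault.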
and that you are sure all your TRIMs are in full raid stripes -
otherwise you will /seriously/ mess up your raid arrays.
Again, for RAID0/1 (even 10) I don't see why this would harm any
data.
Fair enough for RAID1. Just don't try it with RAID5!
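To make the RAID5 warning concrete, here is a small worked example with a three-disk stripe (two data chunks plus XOR parity). The chunk contents are invented, and it assumes the trimmed device returns zeros afterwards; any other returned content breaks the stripe in the same way.

```python
def xor_bytes(*chunks):
    """XOR equally sized byte strings together, as RAID5 parity does."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, value in enumerate(chunk):
            out[i] ^= value
    return bytes(out)


d0 = b"\x11\x22"             # data chunk on disk 0
d1 = b"\x33\x44"             # data chunk on disk 1
parity = xor_bytes(d0, d1)   # parity chunk on disk 2

# Consistent stripe: if disk 1 fails, its chunk is rebuilt from d0 and parity.
recovered_before = xor_bytes(d0, parity)

# Now only d0 is discarded and parity is NOT updated (a partial-stripe TRIM).
d0_after_trim = b"\x00\x00"  # assumed: the SSD returns zeros after TRIM

# If disk 1 fails now, the rebuild uses the trimmed chunk and the stale parity.
recovered_after = xor_bytes(d0_after_trim, parity)
```

Here recovered_before equals d1, but recovered_after does not: the rebuild hands back wrong data for a chunk that was never trimmed. With RAID1 there is no parity across chunks, so this failure mode does not arise.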