James Bottomley wrote:
On Fri, 2008-11-07 at 11:11 -0500, Chris Mason wrote:
On Fri, 2008-11-07 at 10:06 -0600, James Bottomley wrote:
On Fri, 2008-11-07 at 11:00 -0500, Martin K. Petersen wrote:
"Chris" == Chris Mason <chris.mason@xxxxxxxxxx> writes:
Chris> Hmmm, it's surprising to me that arrays who tell us please use
Chris> the noop elevator suddenly want us to merge discard requests.
Chris> The array really needs to be able to deal with this internally.
Let's also not forget that we're talking about merging discard
requests for the purpose of making internal array housekeeping efficient.
That involves merging discards up to the internal array block sizes
which may be on the order of 512/768/1024 KB.
If we were talking about merging discards up to a 4/8/16 KB boundary
that might be something we'd have a chance to do within a reasonable
amount of time (bigger than normal read/write I/O but not hours).
But keeping discard state around for long enough to attempt to
aggregate 768KB (and 768KB-aligned) chunks is icky.
Icky but possible. It's the same rb tree affair we use to keep vma
lists (with the same characteristics). The point is that technically we
can do this pretty easily ... all the way down to not losing any
potential discards that the array would ignore. However, procedurally
it would certainly be sending the wrong message to the array vendors
(the message being "sure the OS will sanitise any crap you care to
dump").
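To make the "icky but possible" bookkeeping concrete, here is a minimal sketch of it, not kernel code. It assumes a sorted list of disjoint ranges as a stand-in for the rb tree, and the names `DiscardTracker`, `add`, and `pending` are hypothetical. Each new discard is merged with any pending range it overlaps or touches. Once a merged range covers one or more whole aligned chunks (768 KB in the example from this thread), those chunks are returned for issuing and the unaligned head and tail stay pending, so no potential discard is lost.

```python
import bisect


class DiscardTracker:
    """Hypothetical tracker of pending discards as half-open byte ranges.

    ASSUMPTION: illustrative only. A sorted list stands in for the rb tree
    mentioned in the thread; nothing here mirrors actual kernel code.
    """

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size  # array's internal block size, in bytes
        self.starts = []  # sorted start offsets of disjoint pending ranges
        self.ends = []    # ends[i] is the end of the range at starts[i]

    def pending(self):
        """Return the pending (start, end) ranges in ascending order."""
        return list(zip(self.starts, self.ends))

    def add(self, start, length):
        """Record a discard; return (offset, length) of newly complete
        aligned chunks, or None if no whole chunk is covered yet."""
        end = start + length
        # Indices [lo, hi) are the pending ranges overlapping or touching
        # the new range.
        lo = bisect.bisect_left(self.ends, start)
        hi = bisect.bisect_right(self.starts, end)
        if lo < hi:
            start = min(start, self.starts[lo])
            end = max(end, self.ends[hi - 1])
        # Replace the absorbed ranges (possibly none) with the merged one.
        self.starts[lo:hi] = [start]
        self.ends[lo:hi] = [end]
        return self._emit(lo)

    def _emit(self, i):
        c = self.chunk_size
        s, e = self.starts[i], self.ends[i]
        first = -(-s // c) * c  # round the start up to a chunk boundary
        last = (e // c) * c     # round the end down to a chunk boundary
        if first >= last:
            return None         # no whole aligned chunk covered yet
        # Keep the unaligned head and tail pending for later merging.
        rest = [(a, b) for a, b in ((s, first), (last, e)) if a < b]
        self.starts[i:i + 1] = [a for a, _ in rest]
        self.ends[i:i + 1] = [b for _, b in rest]
        return (first, last - first)
```

With a 768 KB chunk size, three 256 KB discards arriving out of order produce nothing until the third one fills the gap, at which point `add` returns one aligned 768 KB extent. The cost the thread worries about is visible here too: the pending state lives for as long as it takes the remaining pieces of a chunk to arrive, which may be never.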
On the other hand, if we have to do it for flash and MMC anyway ...
It doesn't seem like a good idea to maintain a ton of code that gets
exercised so rarely, especially wrt filesystem crashes.
Heh, am I the only person here who deletes files on a regular basis
(principally to get my disk down from 99%)?
Just testing it would be a fairly large challenge, spread out across N
filesystems. I think we need to keep discard as simple as we possibly
can.
I don't disagree with that ... I'm not saying we *should*, merely that we
*could*.
James
I agree that simple and robust are key, but we will need to try and do
reasonable coalescing of the requests.
Depending on how vendors implement those unmap commands, sending down a
sequence of commands might cause a performance issue if done at too fine
a granularity. Easiest way to handle that is to make sure that we have a
way of disabling the unmap/discard support (mount option?).
Ric