Re: [PATCH 2/2] Add batched discard support for ext4.

Ric Wheeler <rwheeler@xxxxxxxxxx> · Sat, 24 Apr 2010 13:04:53 -0400

On 04/24/2010 11:03 AM, Greg Freemyer wrote:
On Sat, Apr 24, 2010 at 10:43 AM, Eric Sandeen<sandeen@xxxxxxxxxx>  wrote:

Greg Freemyer wrote:

On Sat, Apr 24, 2010 at 9:48 AM, Ric Wheeler<rwheeler@xxxxxxxxxx>  wrote:

On 04/24/2010 09:24 AM, Greg Freemyer wrote:

...

I know I've been arguing against this patch for the single SSD case
and I still think that use case should be handled by userspace as
hdparm/wiper.sh currently does.  In particular for those extreme
scenarios with JBOD SSDs, the user space solution wins because it
knows how to optimize the trim calls via vectorized ranges in the
payload.

I think that you have missed the broader point. This is not on by default,
so you can mount without discard and use whatever user space utility you
like at your discretion.

ric

Ric,

I was trying to say the design should be driven by the large discard
range use case, not the potentially pathological small discard range
use case that would only benefit SSDs.

Greg

Bear in mind that this patch makes the discard range requests substantially
-larger- than what mount -o discard does on ext4 today; in fact, that was
a main goal.

If the kernel could assemble vectors of ranges and pass them down, I think it
could be extended to use that as well.

-Eric

Eric/Ric,

I was responding to Lukas's latest post, which stated:

==
Also, I am currently rewriting the patch to use an rbtree instead of the
bitmap, because there were some concerns about memory consumption. It is a
question whether or not the rbtree will be more memory friendly.
Generally I think that in most "normal" cases it will, but there are some
extreme scenarios where the rbtree will be much worse. Any comments on
this?
==
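
To put rough numbers on the trade-off quoted above, here is a minimal
back-of-the-envelope sketch. The 40-byte node size and the 32768-block
group are assumptions chosen for illustration, not figures from the patch,
and the helper names are hypothetical.

```python
# Rough memory estimate: per-block bitmap versus per-extent tree nodes.
# NODE_BYTES is an assumed figure for illustration, not taken from the patch.

NODE_BYTES = 40  # assumed: tree node links plus start and length fields


def bitmap_bytes(total_blocks: int) -> int:
    """A bitmap needs one bit per block, however the free space is laid out."""
    return (total_blocks + 7) // 8


def tree_bytes(free_extents: int, node_bytes: int = NODE_BYTES) -> int:
    """An extent tree needs one node per contiguous free range."""
    return free_extents * node_bytes


# Assumed group size: 32768 blocks (128 MiB with 4 KiB blocks).
blocks = 32768

# "Normal" case: free space sits in a handful of large extents.
normal = tree_bytes(free_extents=8)

# Extreme case: every other block is free, so each free block is its own extent.
worst = tree_bytes(free_extents=blocks // 2)

print(f"bitmap: {bitmap_bytes(blocks)} bytes")
print(f"tree, 8 extents: {normal} bytes")
print(f"tree, alternating blocks: {worst} bytes")
```

Under these assumptions the bitmap cost is fixed, while the tree cost
scales with fragmentation, which is the "extreme scenario" concern.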

If one optimizes for large discard ranges and ignores the pathological
cases only beneficial to SSDs, then an rbtree wins.

Correct?

Greg

Let's summarize.

1. Everyone agrees that doing larger discards instead of many small
discards is a good thing.

2. Some devices care about this more than others (various types of SSDs
and arrays have different designs and performance with discards). Some
devices do small discards well, others don't.

3. How you get to those bigger discards in our implementation - using a
series of single range requests, using vectored requests, tracking
extents that can be combined in an rbtree or not - is something that we
are working on. The choice between rbtrees and a bitmap is about DRAM
consumption, not the performance of the resulting discard on the target.

4. Devices (some devices) can export their preferences in a standard way 
(look in /sys/block/....).
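
As an illustration of point 3, here is a minimal sketch of combining
adjacent freed extents into larger discard ranges. It is not the code from
the patch: a sorted list stands in for the rbtree, and the FreedExtents
class, its methods, and the min_blocks threshold are all hypothetical names
made up for this example.

```python
import bisect


class FreedExtents:
    """Sorted, non-overlapping (start, length) ranges awaiting discard.

    A sorted list stands in for the rbtree; only the merging logic matters here.
    """

    def __init__(self):
        self._starts = []   # sorted start blocks
        self._lengths = []  # lengths, parallel to _starts

    def add(self, start, length):
        """Record a freed range, merging with any range it touches or overlaps."""
        end = start + length
        i = bisect.bisect_left(self._starts, start)
        # Step back if the previous range reaches this one.
        if i > 0 and self._starts[i - 1] + self._lengths[i - 1] >= start:
            i -= 1
            start = self._starts[i]
            end = max(end, self._starts[i] + self._lengths[i])
        # Absorb every following range that begins at or before our end.
        j = i
        while j < len(self._starts) and self._starts[j] <= end:
            end = max(end, self._starts[j] + self._lengths[j])
            j += 1
        self._starts[i:j] = [start]
        self._lengths[i:j] = [end - start]

    def ranges(self, min_blocks=1):
        """Return ranges of at least min_blocks blocks, in ascending order."""
        return [(s, n) for s, n in zip(self._starts, self._lengths)
                if n >= min_blocks]


freed = FreedExtents()
for start, length in [(10, 5), (15, 5), (0, 4), (4, 6), (100, 1)]:
    freed.add(start, length)

print(freed.ranges())              # [(0, 20), (100, 1)]
print(freed.ranges(min_blocks=8))  # [(0, 20)]
```

Four small frees collapse into one 20-block range here, and a minimum
length filter can skip the isolated single block, which is the general
idea behind issuing fewer, larger discards.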

If you want to influence the code, please do try the various options on
devices you have at hand and report results.  That is what we are doing
(we includes Lukas, Eric, Jeff and others on this thread) with real
devices from vendors that have given us access. We are talking to them
directly and trying out different workloads, but we certainly welcome
real-world results and suggestions.

Thanks!

Ric
