Re: [PATCH 01/11 RESEND] libe2p: Add new function get_fragment_score()

"Ted Ts'o" <tytso@xxxxxxx> · Tue, 21 Jun 2011 09:56:08 -0400

On Tue, Jun 21, 2011 at 08:26:25PM +0900, Kazuya Mio wrote:
> 
> I decided to implement a fragmentation score for the two purposes:
> one is for filefrag that outputs the score to decide which files should be
> defragmented, and the other is for e4defrag that compares two files'
> fragmentation to prevent the worse fragmentation.

I'm really nervous about having filefrag print a "fragmentation
score".  The trouble is that fragmentation is invariably far more
complex than can be boiled down into a single number, and so users
look at it and start worrying when they shouldn't.

And the statement, "so that e4defrag can compare two files'
fragmentation to prevent the worse fragmentation" begs the question of
what is "worse".  The real issue here is that it's a multidimensional
problem.

> Certainly, the same fragmentation score doesn't always mean the same
> fragmentation. Just as Andreas said, "fragments per MB" is a good idea. It's
> easy to understand, and other filesystem also would be able to use it without
> change. Moreover, there is no worry about what threshold we use to
> the application.

"fragments per megabyte" is definitely better, especially if you
disregard the tail.  It's worth considering how it works for files
smaller than a megabyte.  Do you round the file size up to the nearest
megabyte?  Is it an integer score, or does it need to be floating
point?  An integer score where the size is rounded up to the nearest
megabyte sounds like the best plan, but I'm sure we could still find
some interesting non-linearities that lead to surprising results.
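
To make those questions concrete, here is a rough sketch of the
integer variant.  It is only an illustration under stated assumptions,
not code from e2fsprogs: the names MB_BYTES, size_in_mb_rounded_up()
and frags_per_mb() are hypothetical, a megabyte is taken to be 2^20
bytes, an empty file is treated as one megabyte to avoid dividing by
zero, and the tail is not handled specially.

```c
#include <stdint.h>

/* Assumption: one megabyte is 2^20 bytes. */
#define MB_BYTES (1024ULL * 1024ULL)

/*
 * Hypothetical helper: file size in whole megabytes, rounded up.
 * A zero-length file is counted as 1 MB so the caller never
 * divides by zero.
 */
uint64_t size_in_mb_rounded_up(uint64_t size_bytes)
{
	uint64_t mb = size_bytes / MB_BYTES + (size_bytes % MB_BYTES != 0);

	return mb ? mb : 1;
}

/*
 * Hypothetical integer "fragments per megabyte" score: the fragment
 * count divided by the rounded-up size, truncating toward zero.
 */
uint64_t frags_per_mb(uint64_t nr_frags, uint64_t size_bytes)
{
	return nr_frags / size_in_mb_rounded_up(size_bytes);
}
```

With these definitions a file of exactly one megabyte in three
fragments scores 3, while the same three fragments in a file one byte
larger score 1, because the size rounds up to two megabytes and the
division truncates.  That is one example of the kind of non-linearity
an integer score can produce.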

						- Ted