On Tue, Jun 21, 2011 at 08:26:25PM +0900, Kazuya Mio wrote:
>
> I decided to implement a fragmentation score for the two purposes:
> one is for filefrag that outputs the score to decide which files should be
> defragmented, and the other is for e4defrag that compares two files'
> fragmentation to prevent the worse fragmentation.

I'm really nervous about having filefrag print a "fragmentation
score".  The trouble is that the problem is invariably far more complex
than can be boiled down into a single number, and so users look at it
and start worrying when they shouldn't.

And the statement, "so that e4defrag can compare two files'
fragmentation to prevent the worse fragmentation" begs the question of
what is "worse".  The real issue here is that it's a multidimensional
problem.

> Certainly, the same fragmentation score doesn't always mean the same
> fragmentation. Just as Andreas said, "fragments per MB" is a good idea. It's
> easy to understand, and other filesystem also would be able to use it without
> change. Moreover, there is no worry about what threshold we use to
> the application.

"fragments per megabyte" is definitely better, especially if you
disregard the tail.  It's worth considering how it works for files
smaller than a megabyte.  Do you round the file size up to the nearest
megabyte?  Is it an integer score, or does it need to be floating
point?

An integer score where the size is rounded up to the nearest megabyte
sounds like the best plan, but I'm sure we could still find some
interesting non-linearities that lead to surprising results.

						- Ted
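
As a rough illustration of the integer "fragments per megabyte" score
discussed above, here is a minimal sketch. It assumes the file size is
rounded up to the nearest megabyte (taken here to mean 2^20 bytes), that
the extent count is already known, and that the division truncates. The
function name `fragments_per_mb` and these choices are hypothetical; they
are not taken from filefrag or e4defrag.

```python
MIB = 1024 * 1024  # assumption: "megabyte" means 2**20 bytes


def fragments_per_mb(extent_count: int, size_bytes: int) -> int:
    """Hypothetical integer score: extents per megabyte of file size.

    The size is rounded up to a whole number of megabytes, so any file
    smaller than one megabyte (including an empty one) counts as 1 MB.
    """
    if extent_count < 0 or size_bytes < 0:
        raise ValueError("extent_count and size_bytes must be non-negative")

    # Round the size up to the nearest megabyte, with a floor of 1 MB.
    size_mb = max(1, (size_bytes + MIB - 1) // MIB)

    # Integer (truncating) division keeps the score an integer.
    return extent_count // size_mb


if __name__ == "__main__":
    # Same extent count, sizes that differ by a single byte.
    print(fragments_per_mb(3, MIB))      # exactly 1 MB
    print(fragments_per_mb(3, MIB + 1))  # rounds up to 2 MB
```

With these assumptions, a file of exactly one megabyte with three
extents scores 3, while the same three extents in a file one byte larger
score 1, because the size rounds up to two megabytes and the division
truncates. That is one example of the kind of non-linearity mentioned
above.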