On Mon, Aug 11, 2014 at 11:55:32AM -0700, Darrick J. Wong wrote:
> I was expecting 16 groups (32M readahead) to win, but as the observations in my
> spreadsheet show, 2MB tends to win. I _think_ the reason is that if we
> encounter indirect map blocks or ETB blocks, they tend to be fairly close to
> the file blocks in the block group, and if we're trying to do a large readahead
> at the same time, we end up with a largeish seek penalty (half the flexbg on
> average) for every ETB/map block.

Hmm, that might be an argument for not trying to increase the flex_bg
size, since we want seek distances within a flex_bg to be dominated by
settling time, not by track-to-track acceleration/coasting/deceleration
time.

> I figured out what was going on with the 1TB SSD -- it has a huge RAM cache big
> enough to store most of the metadata. At that point, reads are essentially
> free, but readahead costs us ~1ms per fadvise call.

Do we understand why fadvise() takes 1ms?  Is that something we can fix?

And readahead(2) was even worse, right?

					- Ted