On Tue 24-01-12 14:14:14, Jeff Moyer wrote:
> Chris Mason <chris.mason@xxxxxxxxxx> writes:
>
> >> All three filesystems use the generic mpages code for reads, so they
> >> all get the same (bad) I/O patterns. Looks like we need to fix this up
> >> ASAP.
> >
> > Can you easily run btrfs through the same rig? We don't use mpages and
> > I'm curious.
>
> The readahead code was to blame, here. I wonder if we can change the
> logic there to not break larger I/Os down into smaller sized ones.
> Fengguang, doing a dd if=file of=/dev/null bs=1M results in 128K I/Os,
> when 128KB is the read_ahead_kb value. Is there any heuristic you could
> apply to not break larger I/Os up like this? Does that make sense?

  Well, not breaking up I/Os would be fairly simple, as ondemand_readahead()
already knows how much we want to read. We just artificially trim the
submitted I/O to read_ahead_kb. That is done so that you don't trash the
page cache (possibly evicting pages you have not yet copied to userspace)
when several processes are doing large reads.

Maybe 128 KB is too small a default these days, but OTOH no one prevents
you from raising it (e.g. SLES uses 1 MB as a default).

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
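
To make the trimming described above concrete, here is a rough toy model in Python. It is only a sketch of the arithmetic, not the kernel's logic: the function name split_read() is made up for illustration, and it ignores everything ondemand_readahead() really does (window ramp-up, async readahead, page cache state). It just assumes each submitted I/O is capped at read_ahead_kb.

```python
def split_read(request_kb, read_ahead_kb=128):
    """Toy model (hypothetical helper, not kernel code): chop one
    sequential read of request_kb into I/Os capped at read_ahead_kb."""
    if request_kb < 0 or read_ahead_kb <= 0:
        raise ValueError("request_kb must be >= 0 and read_ahead_kb > 0")
    ios = []
    remaining = request_kb
    while remaining > 0:
        # Each submitted I/O is trimmed to at most read_ahead_kb.
        size = min(remaining, read_ahead_kb)
        ios.append(size)
        remaining -= size
    return ios


if __name__ == "__main__":
    # A bs=1M read with the 128 KB default vs. a 1 MB setting.
    for cap in (128, 1024):
        ios = split_read(1024, cap)
        print(f"read_ahead_kb={cap}: {len(ios)} I/O(s), sizes {ios}")
```

Under this simplified model a 1 MB read becomes eight 128 KB I/Os with the default, and a single I/O once read_ahead_kb is raised to 1024. On a real system the value is a per-device setting exposed at /sys/block/<dev>/queue/read_ahead_kb.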