On Thu, Jan 20, 2011 at 01:44:57PM +0800, Li, Shaohua wrote:
> On Thu, 2011-01-20 at 12:41 +0800, Dave Chinner wrote:
> > On Wed, Jan 19, 2011 at 08:10:14PM -0800, Andrew Morton wrote:
> > > On Thu, 20 Jan 2011 11:21:49 +0800 Shaohua Li <shaohua.li@xxxxxxxxx> wrote:
> > >
> > > > > It seems to return a single offset/length tuple which refers to the
> > > > > btrfs metadata "file", with the intent that this tuple later be fed
> > > > > into a btrfs-specific readahead ioctl.
> > > > >
> > > > > I can see how this might be used with say fatfs or ext3 where all
> > > > > metadata resides within the blockdev address_space. But how is a
> > > > > filesystem which keeps its metadata in multiple address_spaces supposed
> > > > > to use this interface?
> > > > Oh, this looks like a big problem, thanks for letting me know such
> > > > filesystems. is it possible specific filesystem mapping multiple
> > > > address_space ranges to a virtual big ranges? the new ioctls handle the
> > > > mapping.
> > >
> > > I'm not sure what you mean by that.
> > >
> > > ext2, minix and probably others create an address_space for each
> > > directory. Heaven knows what xfs does (for example).
> >
> > In 2.6.39 it won't even use address spaces for metadata caching.
> >
> > Besides, XFS already has pretty sophisticated metadata readahead
> > built in - it's one of the reasons why the XFS directory code scales
> > so well on cold cache lookups of large directories - so I don't see
> > much need for such an interface for XFS.
> >
> > Perhaps btrfs would be better served by implementing speculative
> > metadata readahead in the places where it makes sense (e.g. readdir)
> > because it will improve cold-cache performance on a much wider range
> > of workloads than at just boot-time....
> I don't know about xfs. A sophisticated metadata readahead might make
> metadata async, but I thought it's impossible it can remove the disk
> seek.
> Since metadata and data usually live in different disk block
> ranges, doing data readahead will unavoidably read metadata and cause
> a disk seek between reading data and metadata.

It's standard practice to do in-kernel heuristic readahead for large
directories. That has nothing to do with data/metadata interleaving.

It's exactly interleaved reads that make readahead a must-have.
Think about reading 2+ large files in an interleaved fashion :)

Thanks,
Fengguang
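
The interleaving point can be made concrete with a toy model. This is only
a sketch, not kernel code: it assumes two files that are each contiguous on
disk, a reader that alternates between them one block at a time, and a cost
of one seek for every disk request that does not start where the head
stopped. The function name, the block addresses and the window sizes are
all made up for illustration.

```python
def count_seeks(blocks_per_file, readahead):
    """Count disk seeks when two files are read alternately, block by block.

    Toy model: each file is contiguous, the two files are far apart, and a
    disk request that does not start where the head stopped costs one seek.
    readahead is the number of blocks fetched per disk request (1 = none).
    """
    starts = {"a": 0, "b": 1_000_000}  # hypothetical on-disk start blocks
    cached = {"a": 0, "b": 0}          # blocks of each file already cached
    head = None                        # block just past the last disk read
    seeks = 0
    for i in range(blocks_per_file):
        for name in ("a", "b"):
            if i < cached[name]:
                continue               # cache hit, no disk I/O needed
            want = starts[name] + i
            if head != want:
                seeks += 1             # head has to move to the other file
            n = min(readahead, blocks_per_file - i)
            cached[name] = i + n       # the whole window lands in the cache
            head = want + n
    return seeks


if __name__ == "__main__":
    for window in (1, 4, 16):
        print(window, count_seeks(64, window))
```

With a one-block window every read in this model pays a seek, because the
head was last positioned in the other file. A larger window pays one seek
per window per file, so the seek count drops roughly in proportion to the
window size.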