On Sunday 15 March 2009 14:24:29 Daniel Phillips wrote:
> On Thursday 12 March 2009, Matthew Wilcox wrote:
> > On Fri, Mar 13, 2009 at 12:04:40AM +1100, Nick Piggin wrote:
> > > As far as the per-block pagecache state (as opposed to the per-block fs
> > > state), I don't see any reason it is a problem for efficiency. We have
> > > to do per-page operations anyway.
> >
> > Why? Wouldn't it be nice if we could do arbitrary extents? I suppose
> > superpages or soft page sizes get us most of the way there, but the
> > rounding or pieces at the end are a bit of a pain. Sure, it'll be a
> > huge upheaval for the VM, but we're good at huge upheavals ;-)
>
> Actually, filesystem extents tend to erode the argument for superpages.
> There are three reasons we have seen for wanting big pages: 1) support
> larger block buffers without adding messy changes to buffer.c; 2) TLB
> efficiency; 3) less per-page state in kernel memory. TLB efficiency is
> only there if the hardware supports it, which X86 arguably doesn't.
> The main argument for larger block buffers is less per-block transfer
> setup overhead, but the BIO model combined with filesystem extents
> does that job better, or at least it will when filesystems learn to
> take better advantage of this.
>
> VM extents on the other hand could possibly do a really good job of
> reducing per-page VM overhead, if anybody still cares about that now
> that 64 bit machines rule the big iron world.
>
> I expect implementing VM extents to be a brutally complex project, as
> filesystem extents always turn out to be, even though one tends to
> enter such projects thinking, how hard could this be? Answer: harder
> than you think. But VM extents would be good for a modest speedup, so
> somebody is sure to get brave enough to try it sometime.

I don't think there is enough evidence to be able to make such an assertion.
When you actually implement extent splitting and merging in a deadlock-free
manner and synchronize everything properly, I wouldn't be surprised if it is
slower most of the time. And if it were significantly faster, memory
fragmentation means that it is going to get significantly slower over the
uptime of the kernel, so you would have to virtually map the kernel and
implement memory defragmentation, at which point you get even slower and
more complex.
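
For readers unfamiliar with the bookkeeping under discussion, here is a
minimal sketch of extent splitting and merging. Everything in it is an
assumption made for illustration: `struct vm_extent`, `extent_split()` and
`extent_merge()` are hypothetical names, not kernel interfaces, and nobody in
this thread proposed this layout. It deliberately leaves out locking and
allocation of the new descriptor, which is where the deadlock and
synchronization cost mentioned above would come from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor: one record covers `count` contiguous page frames. */
struct vm_extent {
	unsigned long start;	/* first page frame number covered */
	unsigned long count;	/* number of page frames covered */
};

/*
 * Split *e at page frame `at`. The head stays in *e, the tail is written
 * to *tail. Returns false if `at` is not strictly inside the extent.
 * No locking here: a real implementation would have to serialize this
 * against every other user of the extent.
 */
bool extent_split(struct vm_extent *e, unsigned long at,
		  struct vm_extent *tail)
{
	assert(e != NULL && tail != NULL);
	if (at <= e->start || at >= e->start + e->count)
		return false;
	tail->start = at;
	tail->count = e->start + e->count - at;
	e->count = at - e->start;
	return true;
}

/*
 * Merge *b into *a if *b begins exactly where *a ends.
 * Returns false if the two extents are not adjacent in that order.
 */
bool extent_merge(struct vm_extent *a, const struct vm_extent *b)
{
	assert(a != NULL && b != NULL);
	if (a->start + a->count != b->start)
		return false;
	a->count += b->count;
	return true;
}
```

Splitting an extent covering frames 100..115 at frame 104 leaves a head of 4
pages and a tail of 12; merging the two restores the original 16. Each
per-page operation that touches only part of an extent forces such a split,
and fragmentation over time limits how often the merge can succeed.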