On Fri, 15 Feb 2013 18:13:04 -0500 Johannes Weiner <hannes@xxxxxxxxxxx> wrote:

> On Fri, Feb 15, 2013 at 01:27:38PM -0800, Andrew Morton wrote:
> > On Fri, 15 Feb 2013 01:34:50 -0500
> > Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> >
> > > + * The status is returned in a vector of bytes. The least significant
> > > + * bit of each byte is 1 if the referenced page is in memory, otherwise
> > > + * it is zero.
> >
> > Also, this is going to be dreadfully inefficient for some obvious cases.
> >
> > We could address that by returning the info in some more efficient
> > representation. That will be run-length encoded in some fashion.
> >
> > The obvious way would be to populate an array of
> >
> > struct page_status {
> > 	u32 present:1;
> > 	u32 count:31;
> > };
> >
> > or whatever.
>
> I'm having a hard time seeing how this could be extended to more
> status bits without stifling the optimization too much.

See other email: add a syscall arg which specifies the boolean status
which we're searching for.

> If we just
> add more status bits to one page_status, the likelihood of long runs
> where all bits are in agreement decreases. But as the optimization
> becomes less and less effective, we are stuck with an interface that
> is more PITA than just using mmap and mincore again.
>
> The user has to supply a worst-case-sized vector with one struct
> page_status per page in the range, but the per-page item will be
> bigger than with the byte vector because of the additional run length
> variable.

Yes, we'd need to tell the kernel how much storage is available for the
structures.

> However, one struct page_status per run leaves you with a worst case
> of one syscall per page in the range.

Yes.

> I dunno. The byte vector might not be optimal but its worst cases
> seem more attractive, is just as extensible, and dead simple to use.

But I think "which pages from this 4TB file are in core" will not be an
uncommon usage, and writing a gig of memory to find three pages is just
awful.

I wonder what the most common usage would be (one should know this
before merging the syscall :)). I guess "is this relatively-small
range of the file in core" and/or "which pages from this
relatively-small range of the file will I need to read", etc.

The syscall should handle the common usages very well. But it
shouldn't handle uncommon usages very badly!
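
To make the tradeoff concrete, here is a rough userspace sketch of the
run-length idea. It is not the proposed syscall: encode_runs() and
byte_vector_size() are hypothetical helper names, the 4KB page size is
an assumption, and uint32_t stands in for the kernel's u32. It
compresses a mincore()-style byte vector into struct page_status runs
and computes how big the plain byte vector would be for a given file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Run-length record, as in the struct quoted above (u32 -> uint32_t). */
struct page_status {
	uint32_t present:1;	/* 1 if every page in this run is in core */
	uint32_t count:31;	/* number of consecutive pages in the run */
};

#define MAX_RUN ((1u << 31) - 1)	/* largest value count:31 can hold */

/*
 * Hypothetical helper: turn a byte vector (bit 0 of each byte set when
 * the page is resident) into runs.  Writes at most 'max' records to
 * 'out' and returns how many records the whole range needs, so the
 * caller can tell when the supplied storage was too small.
 */
static size_t encode_runs(const unsigned char *vec, size_t npages,
			  struct page_status *out, size_t max)
{
	size_t n = 0, i = 0;

	assert(out != NULL || max == 0);

	while (i < npages) {
		unsigned int present = vec[i] & 1;
		uint32_t count = 0;

		/* Extend the run while the residency bit stays the same. */
		while (i < npages && (vec[i] & 1u) == present &&
		       count < MAX_RUN) {
			count++;
			i++;
		}
		if (n < max) {
			out[n].present = present;
			out[n].count = count;
		}
		n++;
	}
	return n;
}

/* Bytes needed for the one-byte-per-page vector (page size assumed). */
static uint64_t byte_vector_size(uint64_t file_bytes, uint64_t page_size)
{
	return (file_bytes + page_size - 1) / page_size;
}
```

With 4KB pages, byte_vector_size(4ULL << 40, 4096) comes to 1GB, which
is the "gig of memory" case. The same range with three scattered
resident pages needs at most seven run records (28 bytes), while a
strictly alternating resident/non-resident pattern needs one record
per page, which is the worst case Johannes describes.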