On Fri, 18 Jul 2014 16:47:26 +0800
"bhe at redhat.com" <bhe at redhat.com> wrote:

> On 07/18/14 at 09:38am, Petr Tesarik wrote:
> > On Fri, 18 Jul 2014 10:16:16 +0800
> > "bhe at redhat.com" <bhe at redhat.com> wrote:
> > > Thanks, Petr. This help.
>
> > I'm afraid you misunderstand the concept. Let me explain:
> >
> > page:
> > The basic memory management block (usually 4K).
> >
> > compound page:
> > The kernel may group adjacent pages into a larger object, and track
> > page flags, refcount, etc. at one place for the whole group. This is
> > called compound pages.
> >
> > Compound pages may be used for different things, either kernel
> > internal allocations or by applications.
> >
> > Note that compound pages may or may not correspond to fewer levels of
> > paging in hardware. Take x86_64 as an example. You can have compound
> > pages with order=1 (i.e. 8K). There is nothing in the hardware to
> > support such page size, so the page table just contains 2 consecutive
> > 4K pages. Only if you have a compound page with order=9 (i.e. 2M), the
> > kernel can use 3-level paging, creating a 2M page in hardware.
> >
> > hugetlbfs:
> > Some compound pages are available through a special filesystem. This
> > filesystem is used solely by user-space applications. However, these
> > pages are formally owned by the kernel (after all, they belong to a
> > filesystem, albeit a very special one). There is nothing in the page
> > flags to tell that they are in fact used by user-space.
>
> I do have a confusion on this. Here it means user space benefits from
> hugetlb only through hugetlbfs. If except of that, kernel may make use
> of hugetlb to get large memory by merging continuous pages, doesn't it?

Well, yes, it _is_ confusing:

1. hugetlbfs makes use of compound pages, but not every compound page
   is a hugetlbfs page, even if it is implemented with fewer levels of
   paging in hardware.

2. The kernel does not use hugetlbfs for its allocations. The kernel
   does use compound pages in many places.

3. Pages assigned to hugetlbfs are not marked as user-space.
   Nevertheless, users of makedumpfile expect that these pages are
   filtered out if they say to filter out "user data".

Support for compound pages is (relatively) easy - just check the head
page to determine the order and then act on the group as a whole (using
the information from the head page). But this is not enough to treat
hugetlbfs pages as "user data", so makedumpfile must have a way to tell
a hugetlbfs compound page from other compound pages, and page flags do
not help in that regard.

Petr T
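
P.S. For illustration only, the "check the head page, then act on the
group as a whole" step might be sketched in C roughly as follows. This
is a toy model, not makedumpfile or kernel code: struct page_desc and
all function names are made up for the example, and a 4K base page is
assumed. It deliberately says nothing about telling hugetlbfs pages
from other compound pages, which is the part page flags cannot solve.

```c
#include <assert.h>
#include <stddef.h>

#define BASE_PAGE_SIZE 4096UL   /* assumption: 4K base pages, as on x86_64 */

/* Hypothetical, simplified page descriptor (NOT the kernel's struct page). */
struct page_desc {
    int is_head;         /* non-zero on the first page of a compound group */
    int is_tail;         /* non-zero on the remaining pages of the group */
    unsigned int order;  /* meaningful on the head page only */
};

/* Size in bytes of a compound page: order=1 -> 8K, order=9 -> 2M. */
static unsigned long compound_size(unsigned int order)
{
    /* keep the shift inside the range of unsigned long */
    assert(order < 8 * sizeof(unsigned long) - 12);
    return BASE_PAGE_SIZE << order;
}

/* Number of base pages covered by the unit that starts at index pfn. */
static unsigned long group_pages(const struct page_desc *map, size_t pfn)
{
    return map[pfn].is_head ? 1UL << map[pfn].order : 1UL;
}

/*
 * Walk the page map and treat every compound group as one unit,
 * using only the information stored in its head page.
 * Returns the number of units (ordinary pages + compound groups).
 */
static size_t count_units(const struct page_desc *map, size_t npages)
{
    size_t units = 0;
    size_t pfn = 0;

    while (pfn < npages) {
        pfn += group_pages(map, pfn);  /* skip the tail pages in one step */
        units++;
    }
    return units;
}
```

A real filter would decide once per head page whether to keep or drop
the group, and then apply that decision to all 1 << order pages.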