Re: hugepages will matter more in the future

On Tue, Apr 13, 2010 at 01:38:25PM +0200, Ingo Molnar wrote:
> 
> * Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote:
> 
> > On Mon, Apr 12, 2010 at 04:22:30AM -0700, Arjan van de Ven wrote:
> > >
> > > Now hugepages have some interesting other advantages, namely they save 
> > > pagetable memory, which for something like TPC-C on a fork based database 
> > > can be a measurable win.
> > 
> > It doesn't save pagetable memory (as in `grep MemFree /proc/meminfo`). [...]
> 
> It does save in terms of CPU cache footprint (which is what the argument was 
> about). The RAM is wasted, but those pagetables are always cache cold.

Definitely, thanks for clarifying this further. This is why I was
careful to specify "as in `grep MemFree /proc/meminfo`".

> I think it's very interesting for 'pure' hugetlb mappings, as a next-step 
> thing. It amounts to 8 bytes wasted per 4K page [0.2% of RAM wasted] - much 
> more with the kind of aliasing that DBs frequently do - for hugetlb workloads 
> it is basically roughly equivalent to a +8 bytes increase in struct page size 
> - few MM hackers would accept that.
> 
> So it will have to be fixed down the line.

It's exactly 4k wasted for each pmd set as pmd_trans_huge. Removing
the pagetable preallocation will be absolutely trivial as far as
huge_memory.c is concerned (it takes like 1 minute of hacking) and in
fact it simplifies the code a bit. What will not be trivial is
handling the -ENOMEM retval in every place that calls
split_huge_page_pmd, which we can definitely address down the line
(ideally by removing split_huge_page_pmd). The other benefit of the
current preallocation is that it doesn't increase the requirements
on the PF_MEMALLOC pool: until we can swap hugepages natively with
huge-swapcache, we need to allocate the pte in order to swap.

Whoever tried this before (Dave IIRC) answered a few emails ago that he
also had to preallocate the pte to avoid running into the above issue.
When he said that, it further convinced me that it's worth going this
way initially. Also note: we're not wasting memory compared to when the
pmd is not huge; we just don't take advantage of the full potential of
hugepages, to keep things more manageable initially.
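
To put rough numbers on the above, here is a minimal sketch of the
arithmetic. It assumes x86-64 geometry (4k base pages, 8-byte pagetable
entries, 512 entries per pte page, so one pmd covers 2M), none of which
is stated in this thread. The helper name pte_page_bytes is hypothetical
and only for illustration; it is not kernel code.

```python
# Assumed x86-64 geometry (not stated in the thread).
PAGE_SIZE = 4096                        # base page and pagetable page size
PTE_SIZE = 8                            # bytes per pagetable entry
PTRS_PER_PTE = PAGE_SIZE // PTE_SIZE    # 512 entries per pte page
HPAGE_SIZE = PTRS_PER_PTE * PAGE_SIZE   # 2M covered by one pmd


def pte_page_bytes(mapped_bytes, huge, preallocate=True):
    """Bytes of pte pages behind `mapped_bytes` of pmd-aligned memory.

    Hypothetical helper for illustration only.
    """
    pmds = mapped_bytes // HPAGE_SIZE
    if huge and not preallocate:
        return 0             # huge pmd with no pte page held in reserve
    return pmds * PAGE_SIZE  # one 4k pte page per pmd


GIB = 1 << 30
print(pte_page_bytes(GIB, huge=False))                    # regular 4k mappings
print(pte_page_bytes(GIB, huge=True))                     # current preallocation
print(pte_page_bytes(GIB, huge=True, preallocate=False))  # preallocation removed
print(PAGE_SIZE / HPAGE_SIZE, PTE_SIZE / PAGE_SIZE)       # both are 1/512
```

Under these assumptions the first two cases cost the same (one 4k pte
page per 2M mapped), which is the "not wasting memory compared to when
pmd is not huge" point, and the ratio matches the roughly 0.2% figure
quoted above.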

Thanks,
Andrea
