On Sat, Sep 29, 2012 at 07:30:06AM -0700, Andi Kleen wrote:
> On Sat, Sep 29, 2012 at 03:48:11PM +0200, Andrea Arcangeli wrote:
> > On Sat, Sep 29, 2012 at 02:37:18AM +0300, Kirill A. Shutemov wrote:
> > > Cons:
> > >  - increases TLB pressure;
> >
> > I generally don't like using 4k tlb entries ever. This only has the
>
> From theory I would also prefer the 2MB huge page.
>
> But some numbers comparing between the two alternatives are definitely
> interesting. Numbers are often better than theory.

Sure, good idea. Standard benchmarks likely aren't using zero pages
though, so I suggest a basic micro benchmark:

   some loop of() {
      memcmp(uninitialized_pointer, (char *)uninitialized_pointer+4G, 4G)
      barrier();
   }

> > There would be a small cache benefit here... but even then some first
> > level caches are virtually indexed IIRC (always physically tagged to
>
> Modern x86 doesn't have virtually indexed caches.

With the above memcmp, I'm quite sure the previous patch will beat the
new one by a wide margin, especially on modern x86 with more 2M TLB
entries and >= 8MB L2 caches.

But I agree we need to verify it before taking a decision, and that
the numbers are better than theory, or to rephrase it, "let's check the
theory is right" :)
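
For anyone who wants to try the memcmp loop suggested above from userspace, here is a minimal sketch of how it could look. It is not code from either patch. The function name zero_page_memcmp_bench() and its parameters are made up for illustration, barrier() is redefined locally as a plain compiler barrier, and it assumes Linux with a 64-bit size_t. The mapping is never written, so every read should be backed by whichever zero page (4k or huge) the running kernel provides; the madvise() hint is best-effort and its result is ignored.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

/* Compiler barrier, similar in spirit to the kernel's barrier(). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical helper: map 2 * half bytes of anonymous memory, never
 * write to it, and memcmp() the first half against the second half
 * `iters` times. Returns the elapsed wall-clock time in seconds, or
 * -1.0 if the mapping fails. The OR of all memcmp() results is stored
 * in *cmp_out when cmp_out is not NULL (expected to be 0, since
 * untouched anonymous memory reads as zeroes).
 */
double zero_page_memcmp_bench(size_t half, int iters, int *cmp_out)
{
    struct timespec t0, t1;
    int acc = 0;
    char *p;

    assert(half > 0 && iters > 0);

    /* MAP_NORESERVE so a large mapping does not trip overcommit limits. */
    p = mmap(NULL, 2 * half, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return -1.0;

#ifdef MADV_HUGEPAGE
    /* Best-effort hint; only matters when THP is in "madvise" mode. */
    (void)madvise(p, 2 * half, MADV_HUGEPAGE);
#endif

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        acc |= memcmp(p, p + half, half);
        barrier(); /* keep the compiler from hoisting the memcmp() */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    munmap(p, 2 * half);
    if (cmp_out)
        *cmp_out = acc;

    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling zero_page_memcmp_bench(4ULL << 30, 10, NULL) would correspond to the 4G case in the mail. Note that the first pass also pays for the read faults, so timing iters = 1 separately from a larger iters value may help to tell fault cost apart from TLB and cache effects.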