On Mon, Jun 02, 2014 at 04:27:52PM +0100, Jan Beulich wrote:
> >>> On 02.06.14 at 17:16, <kirill@xxxxxxxxxxxxx> wrote:
> > On Mon, Jun 02, 2014 at 03:46:10PM +0100, Jan Beulich wrote:
> >> For cold page allocations using the normal clear_highpage() mechanism
> >> may be inefficient on certain architectures, namely due to needlessly
> >> replacing a good part of the data cache contents. Introduce an
> >> arch-overridable clear_cold_highpage() (using streaming non-temporal
> >> stores on x86, where an override gets implemented right away) to make
> >> use of in this specific case.
> >>
> >> Leverage the improvement in the Xen balloon driver, eliminating the
> >> explicit scrub_page() function.
> >
> > Any benchmark data?
> >
> > I've tried non-temporal stores to clear huge pages, but it didn't help
> > much. I believe it can vary between micro-architectures, but we need
> > numbers. I've played with Westmere that time.
>
> It's not at all clear to me what to measure here - after all this isn't
> about improving the page clearing latency or throughput, but about
> avoiding to disturb other operations.

It would be nice to find a workload that benefits from the page allocator
not trashing the cache.

-- 
 Kirill A. Shutemov