Re: [PATCH v2 RESEND] x86: optimize memcpy_flushcache

Mikulas Patocka <mpatocka@xxxxxxxxxx> · Thu, 21 Jun 2018 21:19:27 -0400 (EDT)

On Thu, 21 Jun 2018, Ingo Molnar wrote:

> 
> * Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> 
> > From: Mikulas Patocka <mpatocka@xxxxxxxxxx>
> > Subject: [PATCH v2] x86: optimize memcpy_flushcache
> > 
> > In the context of constant short length stores to persistent memory,
> > memcpy_flushcache suffers from a 2% performance degradation compared to
> > explicitly using the "movnti" instruction.
> > 
> > Optimize 4, 8, and 16 byte memcpy_flushcache calls to explicitly use the
> > movnti instruction with inline assembler.
> 
> Linus requested asm optimizations to include actual benchmarks, so it would be 
> nice to describe how this was tested, on what hardware, and what the before/after 
> numbers are.
> 
> Thanks,
> 
> 	Ingo

It was tested on a 4-core Skylake machine with persistent memory emulated
using the memmap kernel option. The dm-writecache target used the emulated
persistent memory as a cache and a SATA SSD as the backing device. The
patch improved throughput by 2% when writing data using dd.

I don't have access to the machine anymore.
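
For readers without the patch at hand, the fast path it adds can be
sketched in userspace roughly as below. This is a minimal illustration,
not the kernel code: the name copy_flushcache_sketch is made up, the
fallback is plain memcpy() rather than the kernel's generic flushcache
path, and it assumes GCC or Clang inline assembly on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the fast path; not the kernel
 * function. Copies of 4, 8 or 16 bytes use movnti (non-temporal store);
 * every other length, and every non-x86-64 build, uses plain memcpy().
 */
static inline void copy_flushcache_sketch(void *dst, const void *src,
					  size_t cnt)
{
	assert(dst != NULL && src != NULL);
#if defined(__x86_64__) && defined(__GNUC__)
	uint32_t v32;
	uint64_t lo, hi;

	switch (cnt) {
	case 4:
		memcpy(&v32, src, 4);	/* load without alignment assumptions */
		__asm__ volatile("movntil %1, %0"
				 : "=m"(*(uint32_t *)dst) : "r"(v32));
		return;
	case 8:
		memcpy(&lo, src, 8);
		__asm__ volatile("movntiq %1, %0"
				 : "=m"(*(uint64_t *)dst) : "r"(lo));
		return;
	case 16:
		memcpy(&lo, src, 8);
		memcpy(&hi, (const char *)src + 8, 8);
		__asm__ volatile("movntiq %1, %0"
				 : "=m"(*(uint64_t *)dst) : "r"(lo));
		__asm__ volatile("movntiq %1, %0"
				 : "=m"(*(uint64_t *)((char *)dst + 8))
				 : "r"(hi));
		return;
	}
#endif
	/* Assumption: stand-in for the generic path, does not bypass cache. */
	memcpy(dst, src, cnt);
}
```

Non-temporal stores are weakly ordered, so a caller that needs them
ordered before later writes would still issue a store fence (sfence)
afterwards.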

Mikulas

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel