Flushing each cache line explicitly performs better than using non-temporal
stores for transfers larger than 512 bytes.

This patch improves the throughput of the dm-writecache driver:

block size	512		1024		2048		4096
movnti		496 MB/s	642 MB/s	725 MB/s	744 MB/s
clflushopt	373 MB/s	688 MB/s	1.1 GB/s	1.2 GB/s

Note that movnti (used by memcpy_flushcache) performs better under
multithreaded access, which is why it may be better to make this change in
the dm-writecache driver rather than in memcpy_flushcache itself.

Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>

---
 drivers/md/dm-writecache.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/md/dm-writecache.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-writecache.c
+++ linux-2.6/drivers/md/dm-writecache.c
@@ -1140,7 +1140,16 @@ static void bio_copy_block(struct dm_wri
 			}
 		} else {
 			flush_dcache_page(bio_page(bio));
-			memcpy_flushcache(data, buf, size);
+#if defined(CONFIG_X86)
+			if (static_cpu_has(X86_FEATURE_CLFLUSHOPT) && likely(size > 512) && likely(boot_cpu_data.x86_clflush_size == 64)) {
+				unsigned long i;
+				for (i = 0; i < size; i += 64) {
+					memcpy(data + i, buf + i, 64);
+					clflushopt(data + i);
+				}
+			} else
+#endif
+				memcpy_flushcache(data, buf, size);
 		}
 
 		bvec_kunmap_irq(buf, &flags);
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
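
For readers who want to try the copy-then-flush pattern outside the kernel,
the following is a rough user-space sketch and not the driver code. The
names copy_and_flush and CACHE_LINE are hypothetical. It assumes a 64-byte
cache line (the same condition the patch checks via x86_clflush_size) and a
destination aligned to that size. It also swaps in the SSE2 intrinsic
_mm_clflush for the kernel's clflushopt() helper, because clflush is
available on every x86-64 CPU. On other architectures the flush is compiled
out and only the copy remains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>	/* _mm_clflush; also pulls in _mm_sfence */
#define HAVE_X86_FLUSH 1
#else
#define HAVE_X86_FLUSH 0
#endif

/* Assumption: 64-byte cache lines, as the patch requires. */
#define CACHE_LINE 64u

/*
 * Copy size bytes from src to dst one cache line at a time, flushing
 * each destination line right after it is written.
 * dst must be CACHE_LINE-aligned and size a multiple of CACHE_LINE.
 */
static void copy_and_flush(void *dst, const void *src, size_t size)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	assert(size % CACHE_LINE == 0);
	assert((uintptr_t)dst % CACHE_LINE == 0);

	for (i = 0; i < size; i += CACHE_LINE) {
		memcpy(d + i, s + i, CACHE_LINE);
#if HAVE_X86_FLUSH
		_mm_clflush(d + i);	/* stand-in for clflushopt() */
#endif
	}
#if HAVE_X86_FLUSH
	/* Store fence; needed if the weakly ordered clflushopt is substituted. */
	_mm_sfence();
#endif
}
```

A caller would pass a 64-byte-aligned destination, for example
copy_and_flush(dst, src, 4096). The throughput figures in the table above
come from the patch author's measurements on persistent memory, so this
sketch should not be expected to reproduce them on ordinary DRAM.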