On Wed, Nov 08, 2017 at 07:33:09AM -0500, Mikulas Patocka wrote:
> We could use the function clwb() (or arch-independent wrapper dax_flush())
> - that uses the clflushopt instruction on Broadwell or clwb on Skylake -
> but it is very slow, write performance on Broadwell is only 350MB/s.
>
> So in practice I use the movnti instruction that bypasses cache. The
> write-combining buffer is flushed with sfence.

And what do you do for an architecture with virtually indexed caches?

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
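
For readers following the thread, the movnti-plus-sfence approach described in the quoted text can be sketched roughly as below. This is only an illustration and not the code from the patch under discussion: the helper name nt_copy_words is hypothetical, and the plain memcpy() fallback for non-x86 builds is an assumption that does nothing about cache flushing, which is exactly the gap the question about virtually indexed caches points at.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__)
#include <emmintrin.h>  /* _mm_stream_si64 (compiles to movnti), _mm_sfence */
#endif

/*
 * Hypothetical helper: copy n 8-byte words from src to dst.
 * dst must be 8-byte aligned.
 */
static void nt_copy_words(long long *dst, const long long *src, size_t n)
{
#if defined(__x86_64__)
    /* Non-temporal stores bypass the cache and go through the
     * write-combining buffers. */
    for (size_t i = 0; i < n; i++)
        _mm_stream_si64(&dst[i], src[i]);

    /* sfence orders the non-temporal stores and drains the
     * write-combining buffers. */
    _mm_sfence();
#else
    /* Assumption for illustration only: ordinary cached copy, with no
     * architecture-specific cache maintenance. */
    memcpy(dst, src, n * sizeof(*dst));
#endif
}
```

The x86 path relies on instructions with no direct portable equivalent, so any other architecture would need its own cache maintenance in place of the fallback branch.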