Re: Optimizing mmap_queue on AVX/AVX2 CPUs

Rebecca Cran <rebecca@xxxxxxxxxxxx> · Wed, 6 Sep 2017 14:54:43 -0600

> On Sep 6, 2017, at 2:20 PM, Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:

> Does that mean your assembly copy is better than memcpy on generic
> data going memory-memory or is it just in relation to copying to
> block devices?

I'm testing memory-based filesystems (mounted with DAX) using the mmap ioengine - either against an NVDIMM-N DDR4 module or on FreeBSD against an md device.

Both my code using assembly intrinsics and standard loops optimized with -ftree-vectorize perform better than generic memcpy.
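
For illustration only, here is a minimal sketch of the kind of plain loop that -ftree-vectorize can turn into vector loads and stores. It is not the code that was benchmarked; the name copy_words and the choice of 64-bit words are assumptions made for the example. Something like "gcc -O2 -ftree-vectorize -mavx2" would be one way to build it, and depending on compiler version and flags GCC may instead recognize the loop as a memcpy idiom and emit a library call, so it is worth checking the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from fio: copy n_words 64-bit words
 * from src to dst with an ordinary counted loop.
 *
 * "restrict" promises the compiler that the two buffers do not overlap,
 * so the vectorizer does not need a runtime overlap check.
 */
static void copy_words(uint64_t *restrict dst, const uint64_t *restrict src,
                       size_t n_words)
{
    for (size_t i = 0; i < n_words; i++)
        dst[i] = src[i];
}
```

The loop has a simple trip count and no dependencies between iterations, which is the shape the vectorizer handles most readily.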

-- 
Rebecca