Re: Optimizing mmap_queue on AVX/AVX2 CPUs

Sitsofe Wheeler <sitsofe@xxxxxxxxx> · Tue, 12 Sep 2017 00:03:56 +0100

On 7 September 2017 at 21:24, Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:
> On 7 September 2017 at 07:52, Rebecca Cran <rebecca@xxxxxxxxxxxx> wrote:
>>
>>> On Sep 7, 2017, at 12:28 AM, Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:
>>>
>>>> On 7 September 2017 at 07:06, Rebecca Cran <rebecca@xxxxxxxxxxxx> wrote:
>>>> That does make sense: to see the difference you just need to copy data between areas of memory.
>>>
>>> I can't help but be reminded of Linus' comment over on
>>> https://bugzilla.redhat.com/show_bug.cgi?id=638477#c46 .
>>
>> Hmm, are you suggesting by that it's not something we should try and optimize within fio?
>
> No, the opposite - that a non-libc memcpy may outperform the libc one
> (even if it looks simpler in some cases)!
>
>> I can totally understand that, and I'd be willing to put off any further work on this until/if we run into issues testing the performance of future NVDIMM-P (i.e. Storage Class Memory) devices.
>
> It's not my intent to put you off - all your ideas sound good!

A faster memcpy looks like something of a holy grail and there are all
sorts of replacements floating around the net. The most interesting
thing I've come across so far is that sometimes memmove is faster than
memcpy. It seems very system dependent, but here's what the Eigen
project did:
https://bitbucket.org/eigen/eigen/pull-requests/292/adds-a-fast-memcpy-function-to-eigen/diff
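
Since the result seems to depend on the system, the only real way to
know is to measure on the machine in question. Below is a rough sketch
of how that could be done. It is not fio code: bench_copy(), now_ns()
and the copy_fn typedef are names made up for illustration, it assumes
a POSIX system with clock_gettime(), and the empty asm statement is a
GCC/Clang-specific compiler barrier.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() */

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Signature shared by memcpy() and memmove(). */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: copy len bytes between two separate heap
 * buffers iters times using fn and return the elapsed time in
 * nanoseconds. Returns UINT64_MAX if allocation fails.
 */
static uint64_t bench_copy(copy_fn fn, size_t len, unsigned int iters)
{
	unsigned char *src = malloc(len);
	unsigned char *dst = malloc(len);
	uint64_t start, elapsed;
	unsigned int i;

	if (!src || !dst) {
		free(src);
		free(dst);
		return UINT64_MAX;
	}

	/* Touch both buffers first so page faults are not timed. */
	memset(src, 0xA5, len);
	memset(dst, 0, len);

	start = now_ns();
	for (i = 0; i < iters; i++) {
		fn(dst, src, len);
		/* Compiler barrier (GCC/Clang) so the copy is not elided. */
		__asm__ __volatile__("" : : "r"(dst) : "memory");
	}
	elapsed = now_ns() - start;

	/* Sanity check: the copy really happened. */
	assert(memcmp(src, dst, len) == 0);

	free(src);
	free(dst);
	return elapsed;
}
```

Calling bench_copy(memcpy, len, iters) and bench_copy(memmove, len,
iters) for a range of sizes and comparing the returned times gives a
first impression. Treat the numbers with care: they will vary with
buffer size, alignment, cache state and libc version.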

-- 
Sitsofe | http://sucs.org/~sits/