Re: memcpy test

Jens Axboe <axboe@xxxxxxxxx> · Fri, 1 Dec 2017 13:00:03 -0700

On 12/01/2017 12:56 PM, Jens Axboe wrote:
> On 12/01/2017 11:56 AM, Rebecca Cran wrote:
>>
>>> On Dec 1, 2017, at 11:20 AM, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>
>>> which is kind of depressing, since the fastest for larger sizes is the
>>> very dumb and basic implementation that you'll find in any textbook
>>> under the section of "my first memcpy".
>>>
>>> Anyway, for evaluating implementations, we need a way to test them,
>>> and now we have. I'll be happy to take input/patches on the test
>>> itself.
>>
>> Thanks - I meant to reply a few days ago and tell you I will work on a
>> patch for this. 
>>
>> For the simple case, does the compiler do anything interesting? For
>> example, auto-vectorization should be simple for it to do if it knows
>> the capabilities of the target machine.
>
> Doesn't look like it - it just unrolls it a bit, and then uses movzbl.
> So nothing exciting at all.

Ah hang on, there's more to it. There are unrolled bits for the
unaligned length/sizes, and then it does the majority of the work with
movdqa and movups.
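
For reference, a minimal sketch of the kind of byte-at-a-time loop being
discussed. The name simple_memcpy is hypothetical and this is not
necessarily the exact code in the fio test:

```c
#include <stddef.h>

/*
 * Hypothetical "my first memcpy": copy n bytes one at a time.
 * With optimization enabled, a compiler may turn this loop into
 * wider vector moves plus scalar handling of the leftover bytes.
 */
void *simple_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;

	return dst;
}
```

Building this with something like "gcc -O3 -S" (the flags are an
assumption, and results vary by compiler version and target) and reading
the generated assembly is one way to check whether the loop was
vectorized. movzbl is a zero-extending byte load, while movdqa and
movups are 16-byte SSE moves.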

-- 
Jens Axboe
