Re: [PATCHSET v6b 0/11] Turn single segment imports into ITER_UBUF

Jens Axboe <axboe@xxxxxxxxx> · Thu, 30 Mar 2023 16:18:58 -0600

On 3/30/23 3:53 PM, Linus Torvalds wrote:
> On Thu, Mar 30, 2023 at 10:33 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> That said, there might be things to improve here. But that's a task
>> for another time.
> 
> So I ended up looking at this, and funnily enough, the *compat*
> version of the "copy iovec from user" is actually written to be a lot
> more efficient than the "native" version.
> 
> The reason is that the compat version has to load the data one field
> at a time anyway to do the conversion, so it open-codes the loop. And
> it does it all using the efficient "user_access_begin()" etc, so it
> generates good code.
> 
> In contrast, the native version just does a "copy_from_user()" and
> then loops over the result to verify it. And that's actually pretty
> horrid. Doing the open-coded loop that fetches and verifies the iov
> entries one at a time should be much better.
> 
> I dunno. That's my gut feel, at least. And it may explain why your
> "readv()" benchmark has "_copy_from_user()" much higher up than the
> "read()" case.
> 
> Something like the attached *may* help.
> 
> Untested - I only checked the generated assembly to see that it seems
> to be sane, but I might have done something stupid. I basically copied
> the compat code, fixed it up for non-compat types, and then massaged
> it a bit more.

That's a nice improvement - about 6% better for the single vec case,
and that's the full "benchmark". Here are the numbers in usec for
the read-zero. Lower is better, obviously.

-git
1793883
1809305
1782602
1777280
1803978
1798792
1791190
1802017
1804558
1813370
1807696
1785887
1785506
1789876
1780018
1793932
1803655
1798186

-git+patch
1685393
1685891
1688886
1679967
1687551
1693233
1684883
1688779
1682103
1684944
1686928
1687984
1686729
1687009
1684660
1687295
1684893
1685309

-- 
Jens Axboe