Re: Deduplicate io_*_prep calls?

On 2/24/20 12:12 AM, Andres Freund wrote:
> Hi,
> 
> On 2020-02-23 20:52:26 -0700, Jens Axboe wrote:
>> The fast case is not being deferred, that's by far the common (and hot)
>> case, which means io_issue() is called with sqe != NULL. My worry is
>> that by moving it into a prep helper, the compiler isn't smart enough to
>> not make that basically two switches.
> 
> I'm not sure that the benefit of a single switch isn't offset by the lower
> code density due to the additional per-opcode branches.  Not inlining
> the prepare function results in:
> 
> $ size fs/io_uring.o fs/io_uring.before.o
>    text	   data	    bss	    dec	    hex	filename
>   75383	   8237	      8	  83628	  146ac	fs/io_uring.o
>   76959	   8237	      8	  85204	  14cd4	fs/io_uring.before.o
> 
> symbol size
> -io_close_prep 0000000000000066
> -io_connect_prep 0000000000000051
> -io_epoll_ctl_prep 0000000000000051
> -io_issue_sqe 0000000000001101
> +io_issue_sqe 0000000000000de9
> -io_openat2_prep 00000000000000ed
> -io_openat_prep 0000000000000089
> -io_poll_add_prep 0000000000000056
> -io_prep_fsync 0000000000000053
> -io_prep_sfr 000000000000004e
> -io_read_prep 00000000000000ca
> -io_recvmsg_prep 0000000000000079
> -io_req_defer_prep 000000000000058e
> +io_req_defer_prep 0000000000000160
> +io_req_prep 0000000000000d26
> -io_sendmsg_prep 000000000000006b
> -io_statx_prep 00000000000000ed
> -io_write_prep 00000000000000cd
> 
> 
> 
>> Feel free to prove me wrong, I'd love to reduce it ;-)
> 
> With a bit of handholding the compiler can deduplicate the switches. It
> can't recognize on its own that req->opcode can't change between the
> switch for prep and issue. Can be solved by moving the opcode into a
> temporary variable. Also needs an inline for io_req_prep (not surprising,
> it's a bit large).
> 
> That results in a bit bigger code. That's partially because of more
> inlining:
>    text	   data	    bss	    dec	    hex	filename
>   78291	   8237	      8	  86536	  15208	fs/io_uring.o
>   76959	   8237	      8	  85204	  14cd4	fs/io_uring.before.o
> 
> symbol size
> +get_order 0000000000000015
> -io_close_prep 0000000000000066
> -io_connect_prep 0000000000000051
> -io_epoll_ctl_prep 0000000000000051
> -io_issue_sqe 0000000000001101
> +io_issue_sqe 00000000000018fa
> -io_openat2_prep 00000000000000ed
> -io_openat_prep 0000000000000089
> -io_poll_add_prep 0000000000000056
> -io_prep_fsync 0000000000000053
> -io_prep_sfr 000000000000004e
> -io_read_prep 00000000000000ca
> -io_recvmsg_prep 0000000000000079
> -io_req_defer_prep 000000000000058e
> +io_req_defer_prep 0000000000000f12
> -io_sendmsg_prep 000000000000006b
> -io_statx_prep 00000000000000ed
> -io_write_prep 00000000000000cd
> 
> 
> There's still some unnecessary branching on force_nonblocking. The
> second patch just separates the cases needing force_nonblocking
> out. Probably not quite the right structure.
> 
> 
> Oddly enough gcc decides that io_queue_async_work() wouldn't be inlined
> anymore after that. I'm quite doubtful it's a good candidate anyway?
> Seems mighty complex, and not likely to win much. That's a noticeable
> win:
>    text	   data	    bss	    dec	    hex	filename
>   72857	   8141	      8	  81006	  13c6e	fs/io_uring.o
>   76959	   8237	      8	  85204	  14cd4	fs/io_uring.before.o
> --- /tmp/before.txt	2020-02-23 21:00:16.316753022 -0800
> +++ /tmp/after.txt	2020-02-23 23:10:44.979496728 -0800
> -io_commit_cqring 00000000000003ef
> +io_commit_cqring 000000000000012c
> +io_free_req 000000000000005e
> -io_free_req 00000000000002ed
> -io_issue_sqe 0000000000001101
> +io_issue_sqe 0000000000000e86
> -io_poll_remove_one 0000000000000308
> +io_poll_remove_one 0000000000000074
> -io_poll_wake 0000000000000498
> +io_poll_wake 000000000000021c
> +io_queue_async_work 00000000000002a0
> -io_queue_sqe 00000000000008cc
> +io_queue_sqe 0000000000000391

That's OK, it's the slow path, I'd prefer it not to be inlined.

> Not quite sure what the policy is with attaching POC patches? Also send
> as separate emails?

Fine like this, though inlining the patches in the mail makes it easier
to comment on them.

Agree that the first patch looks fine, though I don't quite see why
you want to pass in opcode as a separate argument as it's always
req->opcode. Seeing it separate makes me a bit nervous, thinking that
someone is reading it again from the sqe, or maybe not passing in
the right opcode for the given request. So that seems fragile and it
should go away.
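
To make the concern concrete, here is a minimal userspace sketch of a prep
helper that takes only the request and reads the opcode once into a local,
instead of receiving it as a separate argument. All demo_* names are
hypothetical stand-ins, not the io_uring definitions, and the sketch makes
no claim that a given compiler will actually merge the two switches.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
enum demo_opcode { DEMO_OP_NOP, DEMO_OP_READ, DEMO_OP_WRITE };

struct demo_sqe {
	uint32_t len;
};

struct demo_req {
	uint8_t opcode;
	uint32_t len;
	int prepared;
};

/*
 * Prep helper: the opcode always comes from the request itself, so a
 * caller cannot pass a value that disagrees with req->opcode.
 */
static inline int demo_req_prep(struct demo_req *req,
				const struct demo_sqe *sqe)
{
	const uint8_t opcode = req->opcode;	/* read once into a local */

	switch (opcode) {
	case DEMO_OP_NOP:
		break;
	case DEMO_OP_READ:
	case DEMO_OP_WRITE:
		req->len = sqe->len;
		break;
	default:
		return -1;
	}
	req->prepared = 1;
	return 0;
}

/*
 * Issue path: sqe != NULL is the non-deferred (fast) case, where prep
 * runs right before the request is issued. With sqe == NULL the request
 * is assumed to have been prepared earlier.
 */
static int demo_issue(struct demo_req *req, const struct demo_sqe *sqe)
{
	const uint8_t opcode = req->opcode;	/* same value prep will see */

	if (sqe) {
		int ret = demo_req_prep(req, sqe);

		if (ret)
			return ret;
	}

	switch (opcode) {
	case DEMO_OP_NOP:
		return 0;
	case DEMO_OP_READ:
	case DEMO_OP_WRITE:
		return (int)req->len;	/* pretend the whole length completed */
	default:
		return -1;
	}
}
```

Calling demo_issue(&req, &sqe) covers the fast case, and
demo_issue(&req, NULL) covers a request that was prepared earlier.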

-- 
Jens Axboe



