Re: raid0 vs io_uring

On 11/14/21 20:23, Jens Axboe wrote:
On 11/14/21 10:07 AM, Avi Kivity wrote:
Running a trivial randread, direct=1 fio workload against a RAID-0
composed of some nvme devices, I see this pattern:


fio-7066  [009]  1800.209865: function: io_submit_sqes
fio-7066  [009]  1800.209866: function:    rcu_read_unlock_strict
fio-7066  [009]  1800.209866: function:    io_submit_sqe
fio-7066  [009]  1800.209866: function:       io_init_req
fio-7066  [009]  1800.209866: function:          io_file_get
fio-7066  [009]  1800.209866: function:             fget_many
fio-7066  [009]  1800.209866: function:                __fget_files
fio-7066  [009]  1800.209867: function:                   rcu_read_unlock_strict
fio-7066  [009]  1800.209867: function:       io_req_prep
fio-7066  [009]  1800.209867: function:          io_prep_rw
fio-7066  [009]  1800.209867: function:       io_queue_sqe
fio-7066  [009]  1800.209867: function:          io_req_defer
fio-7066  [009]  1800.209867: function:          __io_queue_sqe
fio-7066  [009]  1800.209868: function:             io_issue_sqe
fio-7066  [009]  1800.209868: function:                io_read
fio-7066  [009]  1800.209868: function:                   io_import_iovec
fio-7066  [009]  1800.209868: function:                   __io_file_supports_async
fio-7066  [009]  1800.209868: function:                      I_BDEV
fio-7066  [009]  1800.209868: function:                   __kmalloc
fio-7066  [009]  1800.209868: function:                      kmalloc_slab
fio-7066  [009]  1800.209868: function:                      __cond_resched
fio-7066  [009]  1800.209868: function:                         rcu_all_qs
fio-7066  [009]  1800.209869: function:                      should_failslab
fio-7066  [009]  1800.209869: function:                   io_req_map_rw
fio-7066  [009]  1800.209869: function:             io_arm_poll_handler
fio-7066  [009]  1800.209869: function:             io_queue_async_work
fio-7066  [009]  1800.209869: function:                io_prep_async_link
fio-7066  [009]  1800.209869: function:                   io_prep_async_work
fio-7066  [009]  1800.209870: function:                io_wq_enqueue
fio-7066  [009]  1800.209870: function:                   io_wqe_enqueue
fio-7066  [009]  1800.209870: function:                      _raw_spin_lock_irqsave
fio-7066  [009]  1800.209870: function:                      _raw_spin_unlock_irqrestore



From this I deduce that __io_file_supports_async() (today named
__io_file_supports_nowait()) returns false, so every io_uring
operation is bounced to a workqueue, with a large resulting loss in
performance.


However, I also see NOWAIT is part of the default set of flags:


#define QUEUE_FLAG_MQ_DEFAULT   ((1 << QUEUE_FLAG_IO_STAT) |            \
                                   (1 << QUEUE_FLAG_SAME_COMP) |          \
                                   (1 << QUEUE_FLAG_NOWAIT))

and I don't see that md touches it (I do see that dm plays with it).
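
A quick way to see what a given queue actually advertises, without tracing, is to read its nowait attribute from sysfs. The sketch below assumes the kernel exposes a read-only /sys/block/<disk>/queue/nowait file holding 0 or 1 (I believe recent kernels do, but treat that as an assumption); the function names and the disk name "md0" in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a sysfs-style boolean attribute file.
 * Returns 1 or 0 for the value found, or -1 if the file cannot be
 * opened or does not contain exactly 0 or 1.
 */
int read_queue_nowait(const char *attr_path)
{
    FILE *f;
    int value = -1;

    assert(attr_path != NULL);

    f = fopen(attr_path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);

    return (value == 0 || value == 1) ? value : -1;
}

/*
 * Convenience wrapper: build the assumed sysfs path for a disk name
 * such as "md0" or "nvme0n1" and read the attribute.
 */
int disk_advertises_nowait(const char *disk)
{
    char path[256];
    int len;

    assert(disk != NULL);

    len = snprintf(path, sizeof(path), "/sys/block/%s/queue/nowait", disk);
    if (len < 0 || (size_t)len >= sizeof(path))
        return -1;
    return read_queue_nowait(path);
}
```

Calling disk_advertises_nowait("md0") and then the same for each member nvme device would show whether the array and its members disagree; a -1 result only means the attribute could not be read, not that nowait is unsupported.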


So, what's the story? Does md not support NOWAIT? If so, that's a huge
blow to io_uring with md. If it does, are there any clues about why I
see requests bouncing to a workqueue?

That is indeed the story: dm supports it, but md doesn't just yet.


Ah, so I missed md clearing the default flags somewhere.


This is a false negative from io_uring's point of view, yes? An md on nvme would be essentially nowait in normal operation; it just doesn't know it. aio on the same device would not block on the same workload.
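
The same distinction can be probed from userspace without io_uring by issuing one O_DIRECT read with RWF_NOWAIT. This is only a sketch: the function name is made up, the 4096-byte size and alignment are assumed to satisfy the device's direct I/O requirements, and the expectation that a queue without the NOWAIT flag rejects the read (likely with EOPNOTSUPP) rather than blocking is something I have not verified on md.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Issue a single 4 KiB O_DIRECT read at offset 0 with RWF_NOWAIT.
 * Returns 0 if the kernel accepted and completed the read, or a
 * negative errno value (for example -EOPNOTSUPP or -EAGAIN) if the
 * open or the nowait read was refused.
 */
int probe_nowait_read(const char *path)
{
    const size_t len = 4096;   /* assumed block size and alignment */
    struct iovec iov;
    void *buf = NULL;
    ssize_t n;
    int fd;
    int ret = 0;

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -errno;

    if (posix_memalign(&buf, len, len) != 0) {
        close(fd);
        return -ENOMEM;
    }

    iov.iov_base = buf;
    iov.iov_len = len;

    /* RWF_NOWAIT asks the kernel to fail instead of blocking. */
    n = preadv2(fd, &iov, 1, 0, RWF_NOWAIT);
    if (n < 0)
        ret = -errno;

    free(buf);
    close(fd);
    return ret;
}
```

Running it against the md device and against one of the underlying nvme devices and comparing the two return values would be the interesting part; reading a block device this way typically requires root.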


It's being worked on right now, though:

https://lore.kernel.org/linux-raid/20211101215143.1580-1-vverma@xxxxxxxxxxxxxxxx/

Should be pretty simple, and then we can push to -stable as well.


That's good to know.




