Running a trivial randread, direct=1 fio workload against a RAID-0
composed of some nvme devices, I see this pattern:
fio-7066 [009] 1800.209865: function: io_submit_sqes
fio-7066 [009] 1800.209866: function: rcu_read_unlock_strict
fio-7066 [009] 1800.209866: function: io_submit_sqe
fio-7066 [009] 1800.209866: function: io_init_req
fio-7066 [009] 1800.209866: function: io_file_get
fio-7066 [009] 1800.209866: function: fget_many
fio-7066 [009] 1800.209866: function: __fget_files
fio-7066 [009] 1800.209867: function: rcu_read_unlock_strict
fio-7066 [009] 1800.209867: function: io_req_prep
fio-7066 [009] 1800.209867: function: io_prep_rw
fio-7066 [009] 1800.209867: function: io_queue_sqe
fio-7066 [009] 1800.209867: function: io_req_defer
fio-7066 [009] 1800.209867: function: __io_queue_sqe
fio-7066 [009] 1800.209868: function: io_issue_sqe
fio-7066 [009] 1800.209868: function: io_read
fio-7066 [009] 1800.209868: function: io_import_iovec
fio-7066 [009] 1800.209868: function: __io_file_supports_async
fio-7066 [009] 1800.209868: function: I_BDEV
fio-7066 [009] 1800.209868: function: __kmalloc
fio-7066 [009] 1800.209868: function: kmalloc_slab
fio-7066 [009] 1800.209868: function: __cond_resched
fio-7066 [009] 1800.209868: function: rcu_all_qs
fio-7066 [009] 1800.209869: function: should_failslab
fio-7066 [009] 1800.209869: function: io_req_map_rw
fio-7066 [009] 1800.209869: function: io_arm_poll_handler
fio-7066 [009] 1800.209869: function: io_queue_async_work
fio-7066 [009] 1800.209869: function: io_prep_async_link
fio-7066 [009] 1800.209869: function: io_prep_async_work
fio-7066 [009] 1800.209870: function: io_wq_enqueue
fio-7066 [009] 1800.209870: function: io_wqe_enqueue
fio-7066 [009] 1800.209870: function: _raw_spin_lock_irqsave
fio-7066 [009] 1800.209870: function: _raw_spin_unlock_irqrestore
From which I deduce that __io_file_supports_async() (today named
__io_file_supports_nowait) returns false, and that therefore every
io_uring operation is bounced to a workqueue, with a correspondingly
large loss in performance.
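For reference, the job was essentially this (a sketch from memory, not
the exact file; the device path is an example):

```ini
; hypothetical reproduction job - /dev/md0 stands in for the md RAID-0
[global]
ioengine=io_uring
direct=1
rw=randread
bs=4k
iodepth=32

[job1]
filename=/dev/md0
```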
However, I also see that NOWAIT is part of the default set of blk-mq
queue flags:
#define QUEUE_FLAG_MQ_DEFAULT ((1 << QUEUE_FLAG_IO_STAT) |   \
                               (1 << QUEUE_FLAG_SAME_COMP) | \
                               (1 << QUEUE_FLAG_NOWAIT))
and I don't see that md touches it (I do see that dm plays with it).
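To make the mechanics concrete, here is an illustrative user-space mock
(not the kernel source; the bit values and the md queue's flags are my
assumptions): as I read it, the nowait check for a block device comes
down to testing QUEUE_FLAG_NOWAIT on the request queue, and
QUEUE_FLAG_MQ_DEFAULT only seeds the flags of blk-mq queues.

```c
#include <stdbool.h>

/* Illustrative bit positions, NOT the real kernel constants. */
enum {
	QUEUE_FLAG_SAME_COMP = 4,
	QUEUE_FLAG_IO_STAT   = 7,
	QUEUE_FLAG_NOWAIT    = 29,
};

struct request_queue {
	unsigned long queue_flags;
};

/* The kernel's blk_queue_nowait() is essentially a
 * test_bit(QUEUE_FLAG_NOWAIT, &q->queue_flags) wrapper. */
bool blk_queue_nowait(const struct request_queue *q)
{
	return q->queue_flags & (1UL << QUEUE_FLAG_NOWAIT);
}

/* A blk-mq queue (e.g. an nvme member device) starts from
 * QUEUE_FLAG_MQ_DEFAULT, so NOWAIT is set from the beginning... */
const struct request_queue nvme_q = {
	.queue_flags = (1UL << QUEUE_FLAG_IO_STAT) |
		       (1UL << QUEUE_FLAG_SAME_COMP) |
		       (1UL << QUEUE_FLAG_NOWAIT),
};

/* ...whereas a queue whose driver never sets NOWAIT fails the check.
 * That md's queue looks like this is my working theory, not a fact
 * I have verified in the source. */
const struct request_queue md_q = {
	.queue_flags = (1UL << QUEUE_FLAG_IO_STAT),
};
```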
So, what's the story? Does md not support NOWAIT? If so, that's a huge
blow to io_uring with md. If it does support it, are there any clues
as to why I see requests bouncing to a workqueue?
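As a rough cross-check independent of io_uring, one can probe whether
a path honors nonblocking reads with preadv2(2) and RWF_NOWAIT (glibc
>= 2.27, kernel >= 4.14). This is only a diagnostic sketch, and a
buffered read rather than the O_DIRECT case above; an EAGAIN or
EOPNOTSUPP result hints that the request could not be served without
blocking:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>

/* Try a single nonblocking read at offset 0.  Returns 0 if it went
 * through, otherwise the errno (EAGAIN/EOPNOTSUPP meaning the path
 * couldn't honor NOWAIT at that moment). */
int probe_nowait(const char *path)
{
	char buf[4096];
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	int fd = open(path, O_RDONLY);
	int err = 0;

	if (fd < 0)
		return errno;
	if (preadv2(fd, &iov, 1, 0, RWF_NOWAIT) < 0)
		err = errno;
	close(fd);
	return err;
}
```

Running probe_nowait("/dev/md0") (example path) on the array in
question would show whether plain NOWAIT reads fail there too.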