Re: [PATCH liburing v1] Add io_uring_iowait_toggle()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/16/24 4:40 PM, David Wei wrote:
> diff --git a/man/io_uring_enter.2 b/man/io_uring_enter.2
> index 5e4121b..c06050a 100644
> --- a/man/io_uring_enter.2
> +++ b/man/io_uring_enter.2
> @@ -104,6 +104,11 @@ If the ring file descriptor has been registered through use of
>  then setting this flag will tell the kernel that the
>  .I ring_fd
>  passed in is the registered ring offset rather than a normal file descriptor.
> +.TP
> +.B IORING_ENTER_NO_IOWAIT
> +If this flag is set, then in_iowait will not be set for the current task if
> +.BR io_uring_enter (2)
> +results in waiting.

I'd probably would say something ala:

If this flag is set, then waiting on events will not be accounted as
iowait for the task if
.BR io_uring_enter (2)
results in waiting.

or something like that. io_iowait is an in-kernel thing, iowait is what
application writers will know about.

> +.B #include <liburing.h>
> +.PP
> +.BI "int io_uring_iowait_toggle(struct io_uring *" ring ",
> +.BI "                            bool " enabled ");"
> +.BI "
> +.fi
> +.SH DESCRIPTION
> +.PP
> +The
> +.BR io_uring_iowait_toggle (3)
> +function toggles for a given
> +.I ring
> +whether in_iowait is set for the current task while waiting for completions.
> +When in_iowait is set, time spent waiting is accounted as iowait time;
> +otherwise, it is accounted as idle time. The default behavior is to always set
> +in_iowait to true.

And ditto here

> +Setting in_iowait achieves two things:
> +
> +1. Account time spent waiting as iowait time
> +
> +2. Enable cpufreq optimisations, setting SCHED_CPUFREQ_IOWAIT on the rq

This should probably be something ala:

Setting in_iowait achieves two things:
.TP
.B Account time spent waiting as iowait time
.TP
.B Enable cpufreq optimisations, setting SCHED_CPUFREQ_IOWAIT on the rq
.PP

to make that format like a man page.

> +Some user tooling attributes iowait time as CPU utilization time, so high
> +iowait time can look like apparent high CPU utilization, even though the task
> +is not scheduled and the CPU is free to run other tasks.  This function
> +provides a way to disable this behavior where it makes sense to do so.

And here. Since this is the main man page, maybe also add something
about how iowait is a relic from the old days of only having one CPU,
and it indicates that the task is block uninterruptibly waiting for IO
and hence cannot do other things. These days it's mostly a bogus
accounting value, but it does help with the cpufreq boost for certain
high frequency waits. Rephrase as needed :-)

> diff --git a/src/queue.c b/src/queue.c
> index c436061..bd2e6af 100644
> --- a/src/queue.c
> +++ b/src/queue.c
> @@ -110,6 +110,8 @@ static int _io_uring_get_cqe(struct io_uring *ring,
>  
>  		if (ring->int_flags & INT_FLAG_REG_RING)
>  			flags |= IORING_ENTER_REGISTERED_RING;
> +		if (ring->int_flags & INT_FLAG_NO_IOWAIT)
> +			flags |= IORING_ENTER_NO_IOWAIT;
>  		ret = __sys_io_uring_enter2(ring->enter_ring_fd, data->submit,
>  					    data->wait_nr, flags, data->arg,
>  					    data->sz);

Not strictly related, and we can always do that after, but now we have
two branches here. Since the INT flags are purely internal, we can
renumber them so that INT_FLAG_REG_RING matches
IORING_ENTER_REGISTERED_RING and INT_FLAG_NO_IOWAIT matches
IORING_ENTER_NO_IOWAIT. With that, you could kill the above two branches
and simply do:

	flags |= (ring->int_flags & INT_TO_ENTER_MASK);

which I think would be a nice thing to do.

Rest looks good, no further comments.

-- 
Jens Axboe





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux