From: Damien Le Moal <damien.lemoal@xxxxxxx>

When the cmdprio_percentage option is used, the specified percentage of
IO will be issued with the highest priority class IOPRIO_CLASS_RT. This
priority class maps to the ATA NCQ "high" priority level and allows
exercising a SATA device to measure its command latency characteristics
in the presence of low and high priority commands.

Beside ATA NCQ commands, Linux block IO schedulers also support IO
priorities and will behave differently in the presence of IOs with
different IO priority classes and values. However, cmdprio_percentage
does not allow specifying all possible priority classes and values.

To solve this, introduce libaio and io_uring engine specific options
cmdprio_class and cmdprio. These new options are the equivalent of the
prioclass and prio options and allow specifying the priority class and
priority value to use for asynchronous I/Os when the cmdprio_percentage
option is used. If not specified, the I/O priority class defaults to
IOPRIO_CLASS_RT and the I/O priority value to 0, as before. Similarly to
the cmdprio_percentage option, these options can specify different
values for read and write I/Os using a comma separated list.

The manpage, HOWTO and fiograph configuration file are updated to
document these new options.

Signed-off-by: Damien Le Moal <damien.lemoal@xxxxxxx>
Signed-off-by: Niklas Cassel <niklas.cassel@xxxxxxx>
---
 HOWTO                        | 26 +++++++++++++++++--
 engines/cmdprio.h            | 11 ++++++++-
 engines/io_uring.c           | 48 ++++++++++++++++++++++++++++++++++--
 engines/libaio.c             | 48 ++++++++++++++++++++++++++++++++++--
 fio.1                        | 24 ++++++++++++++++--
 tools/fiograph/fiograph.conf |  4 +--
 6 files changed, 150 insertions(+), 11 deletions(-)

diff --git a/HOWTO b/HOWTO
index 916f5191..8b7d4957 100644
--- a/HOWTO
+++ b/HOWTO
@@ -2172,6 +2172,26 @@ with the caveat that when used on the command line, they must come after the
 	to be effective, NCQ priority must be supported and enabled, and `direct=1'
 	option must be used. fio must also be run as the root user.
 
+.. option:: cmdprio_class=int[,int] : [io_uring] [libaio]
+
+	Set the I/O priority class to use for I/Os that must be issued with
+	a priority when :option:`cmdprio_percentage` is set. If not specified
+	when :option:`cmdprio_percentage` is set, this defaults to the highest
+	priority class. A single value applies to reads and writes.
+	Comma-separated values may be specified for reads and writes. See
+	:manpage:`ionice(1)`. See also the :option:`prioclass` option.
+
+.. option:: cmdprio=int[,int] : [io_uring] [libaio]
+
+	Set the I/O priority value to use for I/Os that must be issued with
+	a priority when :option:`cmdprio_percentage` is set. If not specified
+	when :option:`cmdprio_percentage` is set, this defaults to 0.
+	Linux limits us to a positive value between 0 and 7, with 0 being the
+	highest. A single value applies to reads and writes. Comma-separated
+	values may be specified for reads and writes. See :manpage:`ionice(1)`.
+	Refer to an appropriate manpage for other operating systems since
+	meaning of priority may differ. See also the :option:`prio` option.
+
 .. option:: fixedbufs : [io_uring]
 
 	If fio is asked to do direct IO, then Linux will map pages for each
@@ -2974,12 +2994,14 @@ Threads, processes and job synchronization
 	between 0 and 7, with 0 being the highest. See man
 	:manpage:`ionice(1)`. Refer to an appropriate manpage for other operating
 	systems since meaning of priority may differ. For per-command priority
-	setting, see I/O engine specific `cmdprio_percentage` option.
+	setting, see I/O engine specific :option:`cmdprio_percentage` and
+	:option:`cmdprio` options.
 
 .. option:: prioclass=int
 
 	Set the I/O priority class. See man :manpage:`ionice(1)`. For per-command
-	priority setting, see I/O engine specific `cmdprio_percentage` option.
+	priority setting, see I/O engine specific :option:`cmdprio_percentage`
+	and :option:`cmdprio_class` options.
 
 .. option:: cpus_allowed=str
 
diff --git a/engines/cmdprio.h b/engines/cmdprio.h
index 19120d78..e3b42182 100644
--- a/engines/cmdprio.h
+++ b/engines/cmdprio.h
@@ -10,6 +10,8 @@
 
 struct cmdprio {
 	unsigned int percentage[DDIR_RWDIR_CNT];
+	unsigned int class[DDIR_RWDIR_CNT];
+	unsigned int level[DDIR_RWDIR_CNT];
 };
 
 static int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio,
@@ -19,9 +21,16 @@ static int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio,
 	bool has_cmdprio_percentage = false;
 	int i;
 
+	/*
+	 * If cmdprio_percentage is set and cmdprio_class is not set,
+	 * default to RT priority class.
+	 */
 	for (i = 0; i < DDIR_RWDIR_CNT; i++) {
-		if (cmdprio->percentage[i])
+		if (cmdprio->percentage[i]) {
+			if (!cmdprio->class[i])
+				cmdprio->class[i] = IOPRIO_CLASS_RT;
 			has_cmdprio_percentage = true;
+		}
 	}
 
 	/*
diff --git a/engines/io_uring.c b/engines/io_uring.c
index 1731eb24..1591ee4e 100644
--- a/engines/io_uring.c
+++ b/engines/io_uring.c
@@ -133,6 +133,36 @@ static struct fio_option options[] = {
 		.category = FIO_OPT_C_ENGINE,
 		.group = FIO_OPT_G_IOURING,
 	},
+	{
+		.name = "cmdprio_class",
+		.lname = "Asynchronous I/O priority class",
+		.type = FIO_OPT_INT,
+		.off1 = offsetof(struct ioring_options,
+				 cmdprio.class[DDIR_READ]),
+		.off2 = offsetof(struct ioring_options,
+				 cmdprio.class[DDIR_WRITE]),
+		.help = "Set asynchronous IO priority class",
+		.minval = IOPRIO_MIN_PRIO_CLASS + 1,
+		.maxval = IOPRIO_MAX_PRIO_CLASS,
+		.interval = 1,
+		.category = FIO_OPT_C_ENGINE,
+		.group = FIO_OPT_G_IOURING,
+	},
+	{
+		.name = "cmdprio",
+		.lname = "Asynchronous I/O priority level",
+		.type = FIO_OPT_INT,
+		.off1 = offsetof(struct ioring_options,
+				 cmdprio.level[DDIR_READ]),
+		.off2 = offsetof(struct ioring_options,
+				 cmdprio.level[DDIR_WRITE]),
+		.help = "Set asynchronous IO priority level",
+		.minval = IOPRIO_MIN_PRIO,
+		.maxval = IOPRIO_MAX_PRIO,
+		.interval = 1,
+		.category = FIO_OPT_C_ENGINE,
+		.group = FIO_OPT_G_IOURING,
+	},
 #else
 	{
 		.name = "cmdprio_percentage",
@@ -140,6 +170,18 @@ static struct fio_option options[] = {
 		.type = FIO_OPT_UNSUPPORTED,
 		.help = "Your platform does not support I/O priority classes",
 	},
+	{
+		.name = "cmdprio_class",
+		.lname = "Asynchronous I/O priority class",
+		.type = FIO_OPT_UNSUPPORTED,
+		.help = "Your platform does not support I/O priority classes",
+	},
+	{
+		.name = "cmdprio",
+		.lname = "Asynchronous I/O priority level",
+		.type = FIO_OPT_UNSUPPORTED,
+		.help = "Your platform does not support I/O priority classes",
+	},
 #endif
 	{
 		.name = "fixedbufs",
@@ -389,10 +431,12 @@ static void fio_ioring_prio_prep(struct thread_data *td, struct io_u *io_u)
 	struct ioring_data *ld = td->io_ops_data;
 	struct io_uring_sqe *sqe = &ld->sqes[io_u->index];
 	struct cmdprio *cmdprio = &o->cmdprio;
-	unsigned int p = cmdprio->percentage[io_u->ddir];
+	enum fio_ddir ddir = io_u->ddir;
+	unsigned int p = cmdprio->percentage[ddir];
 
 	if (p && rand_between(&td->prio_state, 0, 99) < p) {
-		sqe->ioprio = ioprio_value(IOPRIO_CLASS_RT, 0);
+		sqe->ioprio =
+			ioprio_value(cmdprio->class[ddir], cmdprio->level[ddir]);
 		io_u->flags |= IO_U_F_PRIORITY;
 	} else {
 		sqe->ioprio = 0;
diff --git a/engines/libaio.c b/engines/libaio.c
index 8cf560c5..8b965fe2 100644
--- a/engines/libaio.c
+++ b/engines/libaio.c
@@ -87,6 +87,36 @@ static struct fio_option options[] = {
 		.category = FIO_OPT_C_ENGINE,
 		.group = FIO_OPT_G_LIBAIO,
 	},
+	{
+		.name = "cmdprio_class",
+		.lname = "Asynchronous I/O priority class",
+		.type = FIO_OPT_INT,
+		.off1 = offsetof(struct libaio_options,
+				 cmdprio.class[DDIR_READ]),
+		.off2 = offsetof(struct libaio_options,
+				 cmdprio.class[DDIR_WRITE]),
+		.help = "Set asynchronous IO priority class",
+		.minval = IOPRIO_MIN_PRIO_CLASS + 1,
+		.maxval = IOPRIO_MAX_PRIO_CLASS,
+		.interval = 1,
+		.category = FIO_OPT_C_ENGINE,
+		.group = FIO_OPT_G_LIBAIO,
+	},
+	{
+		.name = "cmdprio",
+		.lname = "Asynchronous I/O priority level",
+		.type = FIO_OPT_INT,
+		.off1 = offsetof(struct libaio_options,
+				 cmdprio.level[DDIR_READ]),
+		.off2 = offsetof(struct libaio_options,
+				 cmdprio.level[DDIR_WRITE]),
+		.help = "Set asynchronous IO priority level",
+		.minval = IOPRIO_MIN_PRIO,
+		.maxval = IOPRIO_MAX_PRIO,
+		.interval = 1,
+		.category = FIO_OPT_C_ENGINE,
+		.group = FIO_OPT_G_LIBAIO,
+	},
 #else
 	{
 		.name = "cmdprio_percentage",
@@ -94,6 +124,18 @@ static struct fio_option options[] = {
 		.type = FIO_OPT_UNSUPPORTED,
 		.help = "Your platform does not support I/O priority classes",
 	},
+	{
+		.name = "cmdprio_class",
+		.lname = "Asynchronous I/O priority class",
+		.type = FIO_OPT_UNSUPPORTED,
+		.help = "Your platform does not support I/O priority classes",
+	},
+	{
+		.name = "cmdprio",
+		.lname = "Asynchronous I/O priority level",
+		.type = FIO_OPT_UNSUPPORTED,
+		.help = "Your platform does not support I/O priority classes",
+	},
 #endif
 	{
 		.name = "nowait",
@@ -142,10 +184,12 @@ static void fio_libaio_prio_prep(struct thread_data *td, struct io_u *io_u)
 {
 	struct libaio_options *o = td->eo;
 	struct cmdprio *cmdprio = &o->cmdprio;
-	unsigned int p = cmdprio->percentage[io_u->ddir];
+	enum fio_ddir ddir = io_u->ddir;
+	unsigned int p = cmdprio->percentage[ddir];
 
 	if (p && rand_between(&td->prio_state, 0, 99) < p) {
-		io_u->iocb.aio_reqprio = ioprio_value(IOPRIO_CLASS_RT, 0);
+		io_u->iocb.aio_reqprio =
+			ioprio_value(cmdprio->class[ddir], cmdprio->level[ddir]);
 		io_u->iocb.u.c.flags |= IOCB_FLAG_IOPRIO;
 		io_u->flags |= IO_U_F_PRIORITY;
 	}
diff --git a/fio.1 b/fio.1
index 3611da98..09b97de3 100644
--- a/fio.1
+++ b/fio.1
@@ -1970,6 +1970,24 @@ with the `prio` or `prioclass` options. For this option to be effective,
 NCQ priority must be supported and enabled, and `direct=1' option must be
 used. fio must also be run as the root user.
 .TP
+.BI (io_uring,libaio)cmdprio_class \fR=\fPint[,int]
+Set the I/O priority class to use for I/Os that must be issued with a
+priority when \fBcmdprio_percentage\fR is set. If not specified when
+\fBcmdprio_percentage\fR is set, this defaults to the highest priority
+class. A single value applies to reads and writes. Comma-separated
+values may be specified for reads and writes. See man \fBionice\fR\|(1).
+See also the \fBprioclass\fR option.
+.TP
+.BI (io_uring,libaio)cmdprio \fR=\fPint[,int]
+Set the I/O priority value to use for I/Os that must be issued with a
+priority when \fBcmdprio_percentage\fR is set. If not specified when
+\fBcmdprio_percentage\fR is set, this defaults to 0. Linux limits us to
+a positive value between 0 and 7, with 0 being the highest. A single
+value applies to reads and writes. Comma-separated values may be specified
+for reads and writes. See man \fBionice\fR\|(1). Refer to an appropriate
+manpage for other operating systems since the meaning of priority may differ.
+See also the \fBprio\fR option.
+.TP
 .BI (io_uring)fixedbufs
 If fio is asked to do direct IO, then Linux will map pages for each IO call, and
 release them when IO is done. If this option is set, the pages are pre-mapped
@@ -2693,11 +2711,13 @@ Set the I/O priority value of this job. Linux limits us to a positive value
 between 0 and 7, with 0 being the highest. See man \fBionice\fR\|(1). Refer to
 an appropriate manpage for other operating systems since meaning of priority
 may differ. For per-command priority
-setting, see the I/O engine specific `cmdprio_percentage` option.
+setting, see the I/O engine specific `cmdprio_percentage` and
+`cmdprio` options.
 .TP
 .BI prioclass \fR=\fPint
 Set the I/O priority class. See man \fBionice\fR\|(1). For per-command
-priority setting, see the I/O engine specific `cmdprio_percentage` option.
+priority setting, see the I/O engine specific `cmdprio_percentage` and
+`cmdprio_class` options.
 .TP
 .BI cpus_allowed \fR=\fPstr
 Controls the same options as \fBcpumask\fR, but accepts a textual
diff --git a/tools/fiograph/fiograph.conf b/tools/fiograph/fiograph.conf
index 1957e11d..5ba59c52 100644
--- a/tools/fiograph/fiograph.conf
+++ b/tools/fiograph/fiograph.conf
@@ -51,10 +51,10 @@ specific_options=https http_host http_user http_pass http_s3_key http_s3_ke
 specific_options=ime_psync ime_psyncv
 
 [ioengine_io_uring]
-specific_options=hipri cmdprio_percentage fixedbufs registerfiles sqthread_poll sqthread_poll_cpu nonvectored uncached nowait force_async
+specific_options=hipri cmdprio_percentage cmdprio_class cmdprio fixedbufs registerfiles sqthread_poll sqthread_poll_cpu nonvectored uncached nowait force_async
 
 [ioengine_libaio]
-specific_options=userspace_reap cmdprio_percentage nowait
+specific_options=userspace_reap cmdprio_percentage cmdprio_class cmdprio nowait
 
 [ioengine_libcufile]
 specific_options=gpu_dev_ids cuda_io
-- 
2.31.1
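
For readers following the per-command priority logic in this patch, here is a
minimal standalone sketch of the selection path: the default-to-RT step from
fio_cmdprio_init() and the percentage check from the *_prio_prep() helpers.
It is not fio code. Every sketch_* and SKETCH_* name is hypothetical, the
direction count is reduced to two (read, write) for brevity, the random draw
is passed in as a plain argument instead of using rand_between(), and the
packing assumes the Linux ioprio layout (class shifted left by 13 bits, OR-ed
with the level).

```c
/* Assumed to mirror the Linux ioprio ABI; names are hypothetical. */
#define SKETCH_IOPRIO_CLASS_SHIFT	13
#define SKETCH_IOPRIO_CLASS_RT		1
#define SKETCH_DDIR_CNT			2	/* read and write only */

struct sketch_cmdprio {
	unsigned int percentage[SKETCH_DDIR_CNT];
	unsigned int prio_class[SKETCH_DDIR_CNT];
	unsigned int level[SKETCH_DDIR_CNT];
};

/* Pack a priority class and a priority level into one ioprio word. */
static unsigned int sketch_ioprio_value(unsigned int prio_class,
					unsigned int level)
{
	return (prio_class << SKETCH_IOPRIO_CLASS_SHIFT) | level;
}

/* If a percentage is set but no class was given, fall back to RT. */
static void sketch_cmdprio_init(struct sketch_cmdprio *c)
{
	int i;

	for (i = 0; i < SKETCH_DDIR_CNT; i++) {
		if (c->percentage[i] && !c->prio_class[i])
			c->prio_class[i] = SKETCH_IOPRIO_CLASS_RT;
	}
}

/*
 * Return the ioprio word for one I/O in direction ddir. roll stands in
 * for a random draw in [0, 99]; 0 means no per-command priority.
 */
static unsigned int sketch_cmdprio_pick(const struct sketch_cmdprio *c,
					int ddir, unsigned int roll)
{
	unsigned int p = c->percentage[ddir];

	if (p && roll < p)
		return sketch_ioprio_value(c->prio_class[ddir], c->level[ddir]);
	return 0;
}
```

With percentage = {20, 0} and level = {3, 0}, calling sketch_cmdprio_init()
sets the read class to RT and leaves the write class at 0, so only reads whose
draw falls below 20 get a non-zero ioprio word.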