[PATCH v2 08/11] libaio,io_uring: introduce cmdprio_bssplit

From: Damien Le Moal <damien.lemoal@xxxxxxx>

The cmdprio_percentage, cmdprio_class and cmdprio options accept
separate values for read and write operations. This enables various
I/O priority issuing patterns even under a mixed read-write workload,
but it does not allow differentiating between I/Os of different sizes
within reads or within writes when the bssplit option is used.

Introduce the cmdprio_bssplit option to complement the use of the
bssplit option.  This new option has the same format as the bssplit
option, but the percentage values indicate the percentage of I/O
operations with a particular block size that must be issued with the
priority class and value specified by cmdprio_class and cmdprio.

Signed-off-by: Damien Le Moal <damien.lemoal@xxxxxxx>
Signed-off-by: Niklas Cassel <niklas.cassel@xxxxxxx>
---
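Illustration only, not part of the patch: a minimal sketch of the
per-block-size lookup described above, for a single data direction.
The names prio_split, example_split and prio_percentage are
hypothetical stand-ins, not fio identifiers, and the sizes assume that
"4k" and "64k" parse to 4096 and 65536 bytes. The table models an
option string such as cmdprio_bssplit=4k/100:64k/0 used next to a
bssplit that issues both 4k and 64k I/Os.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one parsed "blocksize/percentage" entry. */
struct prio_split {
	unsigned long long bs;	/* block size in bytes */
	unsigned int perc;	/* percentage of I/Os of this size issued with a priority */
};

/* Models cmdprio_bssplit=4k/100:64k/0 for one data direction. */
static const struct prio_split example_split[] = {
	{ 4096, 100 },
	{ 65536, 0 },
};

/*
 * Return the priority percentage for an I/O of buflen bytes.
 * A block size with no matching entry gets 0, meaning such I/Os
 * are never issued with a priority.
 */
static unsigned int prio_percentage(const struct prio_split *split, size_t nr,
				    unsigned long long buflen)
{
	size_t i;

	assert(split != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		if (split[i].bs == buflen)
			return split[i].perc;
	}

	return 0;
}
```

With this table, every 4k I/O would be a candidate for the priority
class and value given by cmdprio_class and cmdprio, while 64k I/Os and
any size not listed would be issued without a priority.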
 HOWTO                        |  29 ++++++---
 engines/cmdprio.h            | 113 ++++++++++++++++++++++++++++++++++-
 engines/io_uring.c           |  29 ++++++++-
 engines/libaio.c             |  29 ++++++++-
 fio.1                        |  34 +++++++----
 tools/fiograph/fiograph.conf |   4 +-
 6 files changed, 210 insertions(+), 28 deletions(-)

diff --git a/HOWTO b/HOWTO
index 8b7d4957..1853f56a 100644
--- a/HOWTO
+++ b/HOWTO
@@ -2175,23 +2175,38 @@ with the caveat that when used on the command line, they must come after the
 .. option:: cmdprio_class=int[,int] : [io_uring] [libaio]
 
 	Set the I/O priority class to use for I/Os that must be issued with
-	a priority when :option:`cmdprio_percentage` is set. If not specified
-	when :option:`cmdprio_percentage` is set, this defaults to the highest
-	priority class. A single value applies to reads and writes.
-	Comma-separated values may be specified for reads and writes. See
-	:manpage:`ionice(1)`. See also the :option:`prioclass` option.
+	a priority when :option:`cmdprio_percentage` or
+	:option:`cmdprio_bssplit` is set. If not specified when
+	:option:`cmdprio_percentage` or :option:`cmdprio_bssplit` is set,
+	this defaults to the highest priority class. A single value applies
+	to reads and writes. Comma-separated values may be specified for
+	reads and writes. See :manpage:`ionice(1)`. See also the
+	:option:`prioclass` option.
 
 .. option:: cmdprio=int[,int] : [io_uring] [libaio]
 
 	Set the I/O priority value to use for I/Os that must be issued with
-	a priority when :option:`cmdprio_percentage` is set. If not specified
-	when :option:`cmdprio_percentage` is set, this defaults to 0.
+	a priority when :option:`cmdprio_percentage` or
+	:option:`cmdprio_bssplit` is set. If not specified when
+	:option:`cmdprio_percentage` or :option:`cmdprio_bssplit` is set,
+	this defaults to 0.
 	Linux limits us to a positive value between 0 and 7, with 0 being the
 	highest. A single value applies to reads and writes. Comma-separated
 	values may be specified for reads and writes. See :manpage:`ionice(1)`.
 	Refer to an appropriate manpage for other operating systems since
 	meaning of priority may differ. See also the :option:`prio` option.
 
+.. option:: cmdprio_bssplit=str[,str] : [io_uring] [libaio]
+	To get finer control over I/O priority, this option allows
+	specifying the percentage of IOs that must have a priority set
+	depending on the block size of the IO. This option is useful only
+	when used together with the :option:`bssplit` option, that is,
+	multiple different block sizes are used for reads and writes.
+	The format for this option is the same as the format of the
+	:option:`bssplit` option, with the exception that values for
+	trim IOs are ignored. This option is mutually exclusive with the
+	:option:`cmdprio_percentage` option.
+
 .. option:: fixedbufs : [io_uring]
 
     If fio is asked to do direct IO, then Linux will map pages for each
diff --git a/engines/cmdprio.h b/engines/cmdprio.h
index e3b42182..8acdb0b3 100644
--- a/engines/cmdprio.h
+++ b/engines/cmdprio.h
@@ -12,18 +12,106 @@ struct cmdprio {
 	unsigned int percentage[DDIR_RWDIR_CNT];
 	unsigned int class[DDIR_RWDIR_CNT];
 	unsigned int level[DDIR_RWDIR_CNT];
+	unsigned int bssplit_nr[DDIR_RWDIR_CNT];
+	struct bssplit *bssplit[DDIR_RWDIR_CNT];
 };
 
+static int fio_cmdprio_bssplit_ddir(struct thread_options *to, void *cb_arg,
+				    enum fio_ddir ddir, char *str, bool data)
+{
+	struct cmdprio *cmdprio = cb_arg;
+	struct split split;
+	unsigned int i;
+
+	if (ddir == DDIR_TRIM)
+		return 0;
+
+	memset(&split, 0, sizeof(split));
+
+	if (split_parse_ddir(to, &split, str, data, BSSPLIT_MAX))
+		return 1;
+	if (!split.nr)
+		return 0;
+
+	cmdprio->bssplit_nr[ddir] = split.nr;
+	cmdprio->bssplit[ddir] = malloc(split.nr * sizeof(struct bssplit));
+	if (!cmdprio->bssplit[ddir])
+		return 1;
+
+	for (i = 0; i < split.nr; i++) {
+		cmdprio->bssplit[ddir][i].bs = split.val1[i];
+		if (split.val2[i] == -1U) {
+			cmdprio->bssplit[ddir][i].perc = 0;
+		} else {
+			if (split.val2[i] > 100)
+				cmdprio->bssplit[ddir][i].perc = 100;
+			else
+				cmdprio->bssplit[ddir][i].perc = split.val2[i];
+		}
+	}
+
+	return 0;
+}
+
+static int fio_cmdprio_bssplit_parse(struct thread_data *td, const char *input,
+				     struct cmdprio *cmdprio)
+{
+	char *str, *p;
+	int i, ret = 0;
+
+	p = str = strdup(input);
+
+	strip_blank_front(&str);
+	strip_blank_end(str);
+
+	ret = str_split_parse(td, str, fio_cmdprio_bssplit_ddir, cmdprio, false);
+
+	if (parse_dryrun()) {
+		for (i = 0; i < DDIR_RWDIR_CNT; i++) {
+			free(cmdprio->bssplit[i]);
+			cmdprio->bssplit[i] = NULL;
+			cmdprio->bssplit_nr[i] = 0;
+		}
+	}
+
+	free(p);
+	return ret;
+}
+
+static inline int fio_cmdprio_percentage(struct cmdprio *cmdprio,
+					 struct io_u *io_u)
+{
+	enum fio_ddir ddir = io_u->ddir;
+	unsigned int p = cmdprio->percentage[ddir];
+	int i;
+
+	/*
+	 * If cmdprio_percentage option was specified, then use that
+	 * percentage. Otherwise, use cmdprio_bssplit percentages depending
+	 * on the IO size.
+	 */
+	if (p)
+		return p;
+
+	for (i = 0; i < cmdprio->bssplit_nr[ddir]; i++) {
+		if (cmdprio->bssplit[ddir][i].bs == io_u->buflen)
+			return cmdprio->bssplit[ddir][i].perc;
+	}
+
+	return 0;
+}
+
 static int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio,
 			    bool *has_cmdprio)
 {
 	struct thread_options *to = &td->o;
 	bool has_cmdprio_percentage = false;
+	bool has_cmdprio_bssplit = false;
 	int i;
 
 	/*
-	 * If cmdprio_percentage is set and cmdprio_class is not set,
-	 * default to RT priority class.
+	 * If cmdprio_percentage/cmdprio_bssplit is set and cmdprio_class
+	 * is not set, default to RT priority class.
 	 */
 	for (i = 0; i < DDIR_RWDIR_CNT; i++) {
 		if (cmdprio->percentage[i]) {
@@ -31,6 +119,11 @@ static int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio,
 				cmdprio->class[i] = IOPRIO_CLASS_RT;
 			has_cmdprio_percentage = true;
 		}
+		if (cmdprio->bssplit_nr[i]) {
+			if (!cmdprio->class[i])
+				cmdprio->class[i] = IOPRIO_CLASS_RT;
+			has_cmdprio_bssplit = true;
+		}
 	}
 
 	/*
@@ -44,8 +137,22 @@ static int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio,
 			to->name);
 		return 1;
 	}
+	if (has_cmdprio_bssplit &&
+	    (fio_option_is_set(to, ioprio) ||
+	     fio_option_is_set(to, ioprio_class))) {
+		log_err("%s: cmdprio_bssplit option and mutually exclusive "
+			"prio or prioclass option is set, exiting\n",
+			to->name);
+		return 1;
+	}
+	if (has_cmdprio_percentage && has_cmdprio_bssplit) {
+		log_err("%s: cmdprio_percentage and cmdprio_bssplit options "
+			"are mutually exclusive\n",
+			to->name);
+		return 1;
+	}
 
-	*has_cmdprio = has_cmdprio_percentage;
+	*has_cmdprio = has_cmdprio_percentage || has_cmdprio_bssplit;
 
 	return 0;
 }
diff --git a/engines/io_uring.c b/engines/io_uring.c
index 1591ee4e..57124d22 100644
--- a/engines/io_uring.c
+++ b/engines/io_uring.c
@@ -75,7 +75,7 @@ struct ioring_data {
 };
 
 struct ioring_options {
-	void *pad;
+	struct thread_data *td;
 	unsigned int hipri;
 	struct cmdprio cmdprio;
 	unsigned int fixedbufs;
@@ -108,6 +108,15 @@ static int fio_ioring_sqpoll_cb(void *data, unsigned long long *val)
 	return 0;
 }
 
+static int str_cmdprio_bssplit_cb(void *data, const char *input)
+{
+	struct ioring_options *o = data;
+	struct thread_data *td = o->td;
+	struct cmdprio *cmdprio = &o->cmdprio;
+
+	return fio_cmdprio_bssplit_parse(td, input, cmdprio);
+}
+
 static struct fio_option options[] = {
 	{
 		.name	= "hipri",
@@ -163,6 +172,16 @@ static struct fio_option options[] = {
 		.category = FIO_OPT_C_ENGINE,
 		.group	= FIO_OPT_G_IOURING,
 	},
+	{
+		.name   = "cmdprio_bssplit",
+		.lname  = "Priority percentage block size split",
+		.type   = FIO_OPT_STR_ULL,
+		.cb     = str_cmdprio_bssplit_cb,
+		.off1   = offsetof(struct ioring_options, cmdprio.bssplit),
+		.help   = "Set priority percentages for different block sizes",
+		.category = FIO_OPT_C_ENGINE,
+		.group	= FIO_OPT_G_IOURING,
+	},
 #else
 	{
 		.name	= "cmdprio_percentage",
@@ -182,6 +201,12 @@ static struct fio_option options[] = {
 		.type	= FIO_OPT_UNSUPPORTED,
 		.help	= "Your platform does not support I/O priority classes",
 	},
+	{
+		.name   = "cmdprio_bssplit",
+		.lname  = "Priority percentage block size split",
+		.type	= FIO_OPT_UNSUPPORTED,
+		.help	= "Your platform does not support I/O priority classes",
+	},
 #endif
 	{
 		.name	= "fixedbufs",
@@ -432,7 +457,7 @@ static void fio_ioring_prio_prep(struct thread_data *td, struct io_u *io_u)
 	struct io_uring_sqe *sqe = &ld->sqes[io_u->index];
 	struct cmdprio *cmdprio = &o->cmdprio;
 	enum fio_ddir ddir = io_u->ddir;
-	unsigned int p = cmdprio->percentage[ddir];
+	unsigned int p = fio_cmdprio_percentage(cmdprio, io_u);
 
 	if (p && rand_between(&td->prio_state, 0, 99) < p) {
 		sqe->ioprio =
diff --git a/engines/libaio.c b/engines/libaio.c
index 8b965fe2..9fba3b12 100644
--- a/engines/libaio.c
+++ b/engines/libaio.c
@@ -56,12 +56,21 @@ struct libaio_data {
 };
 
 struct libaio_options {
-	void *pad;
+	struct thread_data *td;
 	unsigned int userspace_reap;
 	struct cmdprio cmdprio;
 	unsigned int nowait;
 };
 
+static int str_cmdprio_bssplit_cb(void *data, const char *input)
+{
+	struct libaio_options *o = data;
+	struct thread_data *td = o->td;
+	struct cmdprio *cmdprio = &o->cmdprio;
+
+	return fio_cmdprio_bssplit_parse(td, input, cmdprio);
+}
+
 static struct fio_option options[] = {
 	{
 		.name	= "userspace_reap",
@@ -117,6 +126,16 @@ static struct fio_option options[] = {
 		.category = FIO_OPT_C_ENGINE,
 		.group	= FIO_OPT_G_LIBAIO,
 	},
+	{
+		.name   = "cmdprio_bssplit",
+		.lname  = "Priority percentage block size split",
+		.type   = FIO_OPT_STR_ULL,
+		.cb     = str_cmdprio_bssplit_cb,
+		.off1   = offsetof(struct libaio_options, cmdprio.bssplit),
+		.help   = "Set priority percentages for different block sizes",
+		.category = FIO_OPT_C_ENGINE,
+		.group	= FIO_OPT_G_LIBAIO,
+	},
 #else
 	{
 		.name	= "cmdprio_percentage",
@@ -136,6 +155,12 @@ static struct fio_option options[] = {
 		.type	= FIO_OPT_UNSUPPORTED,
 		.help	= "Your platform does not support I/O priority classes",
 	},
+	{
+		.name   = "cmdprio_bssplit",
+		.lname  = "Priority percentage block size split",
+		.type	= FIO_OPT_UNSUPPORTED,
+		.help	= "Your platform does not support I/O priority classes",
+	},
 #endif
 	{
 		.name	= "nowait",
@@ -185,7 +210,7 @@ static void fio_libaio_prio_prep(struct thread_data *td, struct io_u *io_u)
 	struct libaio_options *o = td->eo;
 	struct cmdprio *cmdprio = &o->cmdprio;
 	enum fio_ddir ddir = io_u->ddir;
-	unsigned int p = cmdprio->percentage[ddir];
+	unsigned int p = fio_cmdprio_percentage(cmdprio, io_u);
 
 	if (p && rand_between(&td->prio_state, 0, 99) < p) {
 		io_u->iocb.aio_reqprio =
diff --git a/fio.1 b/fio.1
index 09b97de3..415a91bb 100644
--- a/fio.1
+++ b/fio.1
@@ -1972,21 +1972,31 @@ used. fio must also be run as the root user.
 .TP
 .BI (io_uring,libaio)cmdprio_class \fR=\fPint[,int]
 Set the I/O priority class to use for I/Os that must be issued with a
-priority when \fBcmdprio_percentage\fR is set. If not specified when
-\fBcmdprio_percentage\fR is set, this defaults to the highest priority
-class. A single value applies to reads and writes. Comma-separated
-values may be specified for reads and writes. See man \fBionice\fR\|(1).
-See also the \fBprioclass\fR option.
+priority when \fBcmdprio_percentage\fR or \fBcmdprio_bssplit\fR is set.
+If not specified when \fBcmdprio_percentage\fR or \fBcmdprio_bssplit\fR
+is set, this defaults to the highest priority class. A single value applies
+to reads and writes. Comma-separated values may be specified for reads and
+writes. See man \fBionice\fR\|(1). See also the \fBprioclass\fR option.
 .TP
 .BI (io_uring,libaio)cmdprio \fR=\fPint[,int]
 Set the I/O priority value to use for I/Os that must be issued with a
-priority when \fBcmdprio_percentage\fR is set. If not specified when
-\fBcmdprio_percentage\fR is set, this defaults to 0. Linux limits us to
-a positive value between 0 and 7, with 0 being the highest. A single
-value applies to reads and writes. Comma-separated values may be specified
-for reads and writes. See man \fBionice\fR\|(1). Refer to an appropriate
-manpage for other operating systems since the meaning of priority may differ.
-See also the \fBprio\fR option.
+priority when \fBcmdprio_percentage\fR or \fBcmdprio_bssplit\fR is set.
+If not specified when \fBcmdprio_percentage\fR or \fBcmdprio_bssplit\fR
+is set, this defaults to 0. Linux limits us to a positive value between
+0 and 7, with 0 being the highest. A single value applies to reads and writes.
+Comma-separated values may be specified for reads and writes. See man
+\fBionice\fR\|(1). Refer to an appropriate manpage for other operating systems
+since the meaning of priority may differ. See also the \fBprio\fR option.
+.TP
+.BI (io_uring,libaio)cmdprio_bssplit \fR=\fPstr[,str]
+To get finer control over I/O priority, this option allows specifying
+the percentage of IOs that must have a priority set depending on the block
+size of the IO. This option is useful only when used together with the option
+\fBbssplit\fR, that is, multiple different block sizes are used for reads and
+writes. The format for this option is the same as the format of the
+\fBbssplit\fR option, with the exception that values for trim IOs are
+ignored. This option is mutually exclusive with the \fBcmdprio_percentage\fR
+option.
 .TP
 .BI (io_uring)fixedbufs
 If fio is asked to do direct IO, then Linux will map pages for each IO call, and
diff --git a/tools/fiograph/fiograph.conf b/tools/fiograph/fiograph.conf
index 5ba59c52..cfd2fd8e 100644
--- a/tools/fiograph/fiograph.conf
+++ b/tools/fiograph/fiograph.conf
@@ -51,10 +51,10 @@ specific_options=https  http_host  http_user  http_pass  http_s3_key  http_s3_ke
 specific_options=ime_psync  ime_psyncv
 
 [ioengine_io_uring]
-specific_options=hipri  cmdprio_percentage  cmdprio_class  cmdprio  fixedbufs  registerfiles  sqthread_poll  sqthread_poll_cpu  nonvectored  uncached  nowait  force_async
+specific_options=hipri  cmdprio_percentage  cmdprio_class  cmdprio  cmdprio_bssplit  fixedbufs  registerfiles  sqthread_poll  sqthread_poll_cpu  nonvectored  uncached  nowait  force_async
 
 [ioengine_libaio]
-specific_options=userspace_reap  cmdprio_percentage  cmdprio_class  cmdprio  nowait
+specific_options=userspace_reap  cmdprio_percentage  cmdprio_class  cmdprio  cmdprio_bssplit  nowait
 
 [ioengine_libcufile]
 specific_options=gpu_dev_ids  cuda_io
-- 
2.31.1