On 6/21/22 01:11, Jan Kara wrote: > ioprio_get(2) can be asked to return the best IO priority from several > tasks (IOPRIO_WHO_PGRP, IOPRIO_WHO_USER). Currently the call treats > tasks without set IO priority as having priority > IOPRIO_CLASS_BE/IOPRIO_BE_NORM however this does not really reflect the > IO priority the task will get (which depends on task's nice value). > > Fix the code to use the real IO priority task's IO will use. For this we > do some factoring out to share the code converting task's CPU priority > to IO priority and we also have to modify code for > ioprio_get(IOPRIO_WHO_PROCESS) to keep returning IOPRIO_CLASS_NONE > priority for tasks without set IO priority as a special case to maintain > userspace visible API. > > Signed-off-by: Jan Kara <jack@xxxxxxx> > --- > block/ioprio.c | 49 ++++++++++++++++++++++++++++++++++++------ > include/linux/ioprio.h | 19 +++------------- > 2 files changed, 45 insertions(+), 23 deletions(-) > > diff --git a/block/ioprio.c b/block/ioprio.c > index b5cf7339709b..a4c19ce0de4c 100644 > --- a/block/ioprio.c > +++ b/block/ioprio.c > @@ -138,6 +138,27 @@ SYSCALL_DEFINE3(ioprio_set, int, which, int, who, int, ioprio) > return ret; > } > > +/* > + * If the task has set an I/O priority, use that. Otherwise, return > + * the default I/O priority. > + */ > +int __get_task_ioprio(struct task_struct *p) > +{ > + struct io_context *ioc = p->io_context; > + int prio; > + > + if (ioc) > + prio = ioc->ioprio; > + else > + prio = IOPRIO_DEFAULT; > + > + if (IOPRIO_PRIO_CLASS(prio) == IOPRIO_CLASS_NONE) > + prio = IOPRIO_PRIO_VALUE(task_nice_ioclass(p), > + task_nice_ioprio(p)); > + return prio; > +} > +EXPORT_SYMBOL_GPL(__get_task_ioprio); > + > static int get_task_ioprio(struct task_struct *p) > { > int ret; > @@ -145,10 +166,29 @@ static int get_task_ioprio(struct task_struct *p) > ret = security_task_getioprio(p); > if (ret) > goto out; > - ret = IOPRIO_DEFAULT; > + task_lock(p); > + ret = __get_task_ioprio(p); > + task_unlock(p); > +out: > + return ret; > +} > + > +/* > + * Return raw IO priority value as set by userspace. We use this for > + * ioprio_get(pid, IOPRIO_WHO_PROCESS) so that we keep historical behavior and > + * also so that userspace can distinguish unset IO priority (which just gets > + * overriden based on task's nice value) from IO priority set to some value. > + */ > +static int get_task_raw_ioprio(struct task_struct *p) { int ret; The "int ret;" declaration is on the wrong line, so is the curly bracket. > + > + ret = security_task_getioprio(p); > + if (ret) > + goto out; > task_lock(p); > if (p->io_context) > ret = p->io_context->ioprio; > + else > + ret = IOPRIO_DEFAULT; > task_unlock(p); > out: > return ret; > @@ -156,11 +196,6 @@ static int get_task_ioprio(struct task_struct *p) > > static int ioprio_best(unsigned short aprio, unsigned short bprio) > { > - if (!ioprio_valid(aprio)) > - aprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_BE_NORM); > - if (!ioprio_valid(bprio)) > - bprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_BE_NORM); > - > return min(aprio, bprio); > } This function could be declared as inline now... > > @@ -181,7 +216,7 @@ SYSCALL_DEFINE2(ioprio_get, int, which, int, who) > else > p = find_task_by_vpid(who); > if (p) > - ret = get_task_ioprio(p); > + ret = get_task_raw_ioprio(p); > break; > case IOPRIO_WHO_PGRP: > if (!who) > diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h > index 519d51fc8902..24e648dc4fb3 100644 > --- a/include/linux/ioprio.h > +++ b/include/linux/ioprio.h > @@ -46,24 +46,11 @@ static inline int task_nice_ioclass(struct task_struct *task) > return IOPRIO_CLASS_BE; > } > > -/* > - * If the calling process has set an I/O priority, use that. Otherwise, return > - * the default I/O priority. > - */ > +int __get_task_ioprio(struct task_struct *p); > + > static inline int get_current_ioprio(void) > { > - struct io_context *ioc = current->io_context; > - int prio; > - > - if (ioc) > - prio = ioc->ioprio; > - else > - prio = IOPRIO_DEFAULT; > - > - if (IOPRIO_PRIO_CLASS(prio) == IOPRIO_CLASS_NONE) > - prio = IOPRIO_PRIO_VALUE(task_nice_ioclass(current), > - task_nice_ioprio(current)); > - return prio; > + return __get_task_ioprio(current); The build bot complained about this one, but I do not understand why. Could it be because you do not have declared __get_task_ioprio() as "extern" ? Also, to reduce refactoring changes in this patch, you could introduce __get_task_ioprio() and make the above change in patch 2. No ? > } > > extern int set_task_ioprio(struct task_struct *task, int ioprio); -- Damien Le Moal Western Digital Research