Re: [PATCH 3/3] block: fix default IO priority handling again

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu 02-06-22 10:53:29, Damien Le Moal wrote:
> On 2022/06/02 1:04, Jan Kara wrote:
> > On Thu 02-06-22 00:08:28, Damien Le Moal wrote:
> >> The different ioprio leading to no BIO merging is definitely a problem
> >> but this patch is not really fixing anything in my opinion. It simply
> >> gets back to the previous "all 0s" ioprio initialization, which do not
> >> show the places where we actually have missing ioprio initialization to
> >> IOPRIO_DEFAULT.
> > 
> > So I agree we should settle on how to treat IOs with unset IO priority. The
> > behavior to use task's CPU priority when IO priority is unset is there for
> > a *long* time and so I think we should preserve that. The question is where
> > in the stack should the switch from "unset" value to "effective ioprio"
> > happen. Switching in IO context is IMO too early since userspace needs to
> > be able to differentiate "unset" from "set to
> > IOPRIO_CLASS_BE,IOPRIO_BE_NORM". But we could have a helper like
> > current_effective_ioprio() that will do the magic with mangling unset IO
> > priority based on task's CPU priority.
> 
> I agree that the task's CPU priority is a more sensible default. However,
> I do not understand your point about the IO context being too early to
> set the effective priority. If we do not do that, getting the issuer CPU
> priority will not be easily possible, right ?

I just meant that in the IO context we need to keep the information whether
IO priority is set to a particular value, or whether it is set to 0 meaning
"inherit from CPU priority". So we cannot just store the effective IO
priority immediately in the IO context instead of 0.

> > The fact is that bio->bi_ioprio gets set to anything only for direct IO in
> > iomap_dio_rw(). The rest of the IO has priority unset (BFQ fetches the
> > priority from task's IO context and ignores priority on bios BTW). And the
> > only place where req->ioprio (inherited from bio->bi_ioprio) gets used is
> > in a few drivers to set HIGHPRI flag for IOPRIO_CLASS_RT IO and then the
> > relatively new code in mq-deadline.
> 
> Yes, only IOPRIO_CLASS_RT has an effect at the hardware level right now.
> Everything else is only for the IO scheduler to play with. I am preparing a
> patch series to support scsi/ata command duration limits though. That adds a new
> IOPRIO_CLASS_DL and that one will also have an effect on the hardware.
> 
> Note that if a process/thread use ioprio_set(), we should also honor the ioprio
> set for buffered reads, at least those page IOs that are issued directly from
> the IO issuer context (which may include readahead... hmmm).

Yes, that would make sense as well.

> >> Isn't it simply that IOPRIO_DEFAULT should be set as the default for any bio
> >> being allocated (in bio_alloc ?) before it is setup and inherits the user io
> >> priority ? Otherwise, the bio io prio is indeed IOPRIO_CLASS_NONE/0 and changing
> >> IOPRIO_DEFAULT to that value removes the differences you observed.
> > 
> > Yes, I think that would make sence although as I explain above this is
> > somewhat independent to what the default IO priority behavior should be.
> I am OK with the use of the task CPU priority/ionice value as the default when
> no other ioprio is set for a bio using the user aio_reqprio or ioprio_set(). If
> this relies on task_nice_ioclass() as it is today (I see no reason to change
> that), then the default class for regular tasks remains IOPRIO_CLASS_BE as is
> defined by IOPRIO_DEFAULT.

Yes, good.

> But to avoid the performance regression you observed, we really need to be 100%
> sure that all bios have their ->bi_ioprio field correctly initialized. Something
> like:
> 
> void bio_set_effective_ioprio(struct *bio)
> {
> 	switch (IOPRIO_PRIO_CLASS(bio->bi_ioprio)) {
> 	case IOPRIO_CLASS_RT:
> 	case IOPRIO_CLASS_BE:
> 	case IOPRIO_CLASS_IDLE:
> 		/*
> 		 * the bio ioprio was already set from an aio kiocb ioprio
> 		 * (aio->aio_reqprio) or from the issuer context ioprio if that
> 		 * context used ioprio_set().
> 		 */;
> 		return;
> 	case IOPRIO_CLASS_NONE:
> 	default:
> 		/* Use the current task CPU priority */
> 		bio->ioprio =
> 			IOPRIO_PRIO_VALUE(task_nice_ioclass(current),
> 					  task_nice_ioprio(current));
> 		return;
> 	}
> }
> 
> being called before a bio is inserted in a scheduler or bypass inserted in the
> dispatch queues should result in all BIOs having an ioprio that is set to
> something other than IOPRIO_CLASS_NONE. And the obvious place may be simply at
> the beginning of submit_bio(), before submit_bio_noacct() is called.
> 
> I am tempted to argue that block device drivers should never see any req
> with an ioprio set to IOPRIO_CLASS_NONE, which means that no bio should
> ever enter the block stack with that ioprio either. With the above
> solution, bios from DM targets submitted with submit_bio_noacct() could
> still have IOPRIO_CLASS_NONE...  So would submit_bio_noacct() be the
> better place to call the effective ioprio helper ?

Yes, I also think it would be the cleanest if we made sure bio->ioprio is
always set to some value other than IOPRIO_CLASS_NONE. I'll see how we can
make that happen in the least painful way :). Thanks for your input!

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux