On 6/28/18 7:26 AM, Josef Bacik wrote:
> On Wed, Jun 27, 2018 at 01:24:55PM -0600, Jens Axboe wrote:
>> On 6/27/18 1:20 PM, Josef Bacik wrote:
>>> On Wed, Jun 27, 2018 at 01:06:31PM -0600, Jens Axboe wrote:
>>>> On 6/25/18 9:12 AM, Josef Bacik wrote:
>>>>> +static void __blkcg_iolatency_throttle(struct rq_qos *rqos,
>>>>> +				       struct iolatency_grp *iolat,
>>>>> +				       spinlock_t *lock, bool issue_as_root,
>>>>> +				       bool use_memdelay)
>>>>> +	__releases(lock)
>>>>> +	__acquires(lock)
>>>>> +{
>>>>> +	struct rq_wait *rqw = &iolat->rq_wait;
>>>>> +	unsigned use_delay = atomic_read(&lat_to_blkg(iolat)->use_delay);
>>>>> +	DEFINE_WAIT(wait);
>>>>> +	bool first_block = true;
>>>>> +
>>>>> +	if (use_delay)
>>>>> +		blkcg_schedule_throttle(rqos->q, use_memdelay);
>>>>> +
>>>>> +	/*
>>>>> +	 * To avoid priority inversions we want to just take a slot if we are
>>>>> +	 * issuing as root.  If we're being killed off there's no point in
>>>>> +	 * delaying things, we may have been killed by OOM so throttling may
>>>>> +	 * make recovery take even longer, so just let the IO's through so the
>>>>> +	 * task can go away.
>>>>> +	 */
>>>>> +	if (issue_as_root || fatal_signal_pending(current)) {
>>>>> +		atomic_inc(&rqw->inflight);
>>>>> +		return;
>>>>> +	}
>>>>> +
>>>>> +	if (iolatency_may_queue(iolat, &wait, first_block))
>>>>> +		return;
>>>>> +
>>>>> +	do {
>>>>> +		prepare_to_wait_exclusive(&rqw->wait, &wait,
>>>>> +					  TASK_UNINTERRUPTIBLE);
>>>>> +
>>>>> +		iolatency_may_queue(iolat, &wait, first_block);
>>>>> +		first_block = false;
>>>>> +
>>>>> +		if (lock) {
>>>>> +			spin_unlock_irq(lock);
>>>>> +			io_schedule();
>>>>> +			spin_lock_irq(lock);
>>>>> +		} else {
>>>>> +			io_schedule();
>>>>> +		}
>>>>> +	} while (1);
>>>>
>>>> So how does this wait loop ever exit?
>>>>
>>>
>>> Sigh, I cleaned this up from what we're using in production and did it poorly,
>>> I'll fix it up.  Thanks,
>>
>> Also may want to consider NOT using exclusive add if first_block == false, as
>> you'll end up at the tail of the waitqueue after sleeping and being denied.
>> This is similar to the wbt change I posted last week.
>>
>
> This isn't how it works though.  You aren't removed from the list until you do
> finish_wait(), so you don't lose your spot on the list.  We only get added to
> the end of the list if
>
> if (list_empty(&wq_entry->entry))
>
> otherwise nothing changes.

I missed that you don't do finish_wait() in the loop, I had played with
that to see if it fixes things. But yeah, as it stands, you are right.

>> For may_queue(), your wq_has_sleeper() is also going to be always true
>> inside your loop, since you call it after doing the prepare_to_wait()
>> which adds you to the queue. That's why wbt does the list checks, but
>> it'd be nicer to have a wq_has_other_sleepers() for that. So your
>> first iolatency_may_queue() inside the loop will always be false.
>
> Ah yeah that's a good point, I'll go back to using what you had to catch that
> case.  Thanks,

Basically we need to do the same thing in wbt and blk-iolatency for this,
so we should sync them up.

-- 
Jens Axboe
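
For readers following the two waitqueue points in this thread, here is a small userspace sketch of the behavior being discussed. It is not kernel code: every `model_*` name below is hypothetical, and the list handling is only an approximation of the kernel's `list_head` helpers. It models (1) a prepare-to-wait that queues the entry only when it is not already on the list, so a waiter keeps its position until a finish-wait removes it, and (2) the difference between "does the queue have any sleeper" and the "has other sleepers" check that the thread wishes for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, loosely modeled on list_head. */
struct node {
	struct node *prev, *next;
};

/* Hypothetical stand-in for a wait queue head. */
struct waitq {
	struct node head;
};

static void node_init(struct node *n) { n->prev = n->next = n; }
static bool node_unlinked(const struct node *n) { return n->next == n; }
static void waitq_init(struct waitq *q) { node_init(&q->head); }

/*
 * Model of an exclusive prepare-to-wait: the entry is appended at the
 * tail only if it is not already queued. Calling it again while still
 * queued leaves the entry's position unchanged.
 */
static void model_prepare_to_wait(struct waitq *q, struct node *e)
{
	assert(q != NULL && e != NULL);
	if (!node_unlinked(e))
		return;
	e->prev = q->head.prev;
	e->next = &q->head;
	q->head.prev->next = e;
	q->head.prev = e;
}

/* Model of finish-wait: unlink the entry, giving up its position. */
static void model_finish_wait(struct node *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	node_init(e);
}

/* True if anyone is queued, including the caller's own entry. */
static bool model_has_sleeper(const struct waitq *q)
{
	return q->head.next != &q->head;
}

/* Hypothetical "other sleepers" check: true only if someone besides e waits. */
static bool model_has_other_sleepers(const struct waitq *q, const struct node *e)
{
	const struct node *n;

	for (n = q->head.next; n != &q->head; n = n->next)
		if (n != e)
			return true;
	return false;
}

/* Zero-based position of e in the queue, or -1 if it is not queued. */
static int model_position(const struct waitq *q, const struct node *e)
{
	const struct node *n;
	int i = 0;

	for (n = q->head.next; n != &q->head; n = n->next, i++)
		if (n == e)
			return i;
	return -1;
}
```

In this model, a lone waiter that has already called `model_prepare_to_wait()` makes `model_has_sleeper()` return true even though nobody else is waiting, which is the situation described for the first `iolatency_may_queue()` call inside the loop. `model_has_other_sleepers()` returns false in that case. Re-preparing without an intervening `model_finish_wait()` keeps the waiter's slot; only finish followed by prepare moves it to the tail.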