On Wed, Jun 27, 2018 at 01:24:55PM -0600, Jens Axboe wrote:
> On 6/27/18 1:20 PM, Josef Bacik wrote:
> > On Wed, Jun 27, 2018 at 01:06:31PM -0600, Jens Axboe wrote:
> >> On 6/25/18 9:12 AM, Josef Bacik wrote:
> >>> +static void __blkcg_iolatency_throttle(struct rq_qos *rqos,
> >>> +				       struct iolatency_grp *iolat,
> >>> +				       spinlock_t *lock, bool issue_as_root,
> >>> +				       bool use_memdelay)
> >>> +	__releases(lock)
> >>> +	__acquires(lock)
> >>> +{
> >>> +	struct rq_wait *rqw = &iolat->rq_wait;
> >>> +	unsigned use_delay = atomic_read(&lat_to_blkg(iolat)->use_delay);
> >>> +	DEFINE_WAIT(wait);
> >>> +	bool first_block = true;
> >>> +
> >>> +	if (use_delay)
> >>> +		blkcg_schedule_throttle(rqos->q, use_memdelay);
> >>> +
> >>> +	/*
> >>> +	 * To avoid priority inversions we want to just take a slot if we are
> >>> +	 * issuing as root.  If we're being killed off there's no point in
> >>> +	 * delaying things, we may have been killed by OOM so throttling may
> >>> +	 * make recovery take even longer, so just let the IO's through so the
> >>> +	 * task can go away.
> >>> +	 */
> >>> +	if (issue_as_root || fatal_signal_pending(current)) {
> >>> +		atomic_inc(&rqw->inflight);
> >>> +		return;
> >>> +	}
> >>> +
> >>> +	if (iolatency_may_queue(iolat, &wait, first_block))
> >>> +		return;
> >>> +
> >>> +	do {
> >>> +		prepare_to_wait_exclusive(&rqw->wait, &wait,
> >>> +					  TASK_UNINTERRUPTIBLE);
> >>> +
> >>> +		iolatency_may_queue(iolat, &wait, first_block);
> >>> +		first_block = false;
> >>> +
> >>> +		if (lock) {
> >>> +			spin_unlock_irq(lock);
> >>> +			io_schedule();
> >>> +			spin_lock_irq(lock);
> >>> +		} else {
> >>> +			io_schedule();
> >>> +		}
> >>> +	} while (1);
> >>
> >> So how does this wait loop ever exit?
> >>
> >
> > Sigh, I cleaned this up from what we're using in production and did it poorly,
> > I'll fix it up. Thanks,
>
> Also may want to consider NOT using exclusive add if first_block == false, as
> you'll end up at the tail of the waitqueue after sleeping and being denied.
> This is similar to the wbt change I posted last week.
>

This isn't how it works though. You aren't removed from the list until you do
finish_wait(), so you don't lose your spot on the list. We only get added to
the end of the list if

	if (list_empty(&wq_entry->entry))

otherwise nothing changes.

> For may_queue(), your wq_has_sleeper() is also going to be always true
> inside your loop, since you call it after doing the prepare_to_wait()
> which adds you to the queue. That's why wbt does the list checks, but
> it'd be nicer to have a wq_has_other_sleepers() for that. So your
> first iolatency_may_queue() inside the loop will always be false.

Ah yeah that's a good point, I'll go back to using what you had to catch that
case. Thanks,

Josef
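
For anyone following along, here is a toy user-space model of the two points discussed above. It is a sketch only, not kernel code: every name in it (list_node, the toy_* helpers, toy_has_other_sleepers) is made up for illustration, and it models just the list_empty() queueing rule and the sleeper check. It does not model wakeups or wake functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy circular doubly linked list, loosely modelled on struct list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n) { n->prev = n->next = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_tail_node(struct list_node *entry, struct list_node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static void list_del_init_node(struct list_node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	list_init(entry);
}

/* Stand-in for prepare_to_wait_exclusive(): queue at the tail only if not already queued. */
static void toy_prepare_to_wait_exclusive(struct list_node *head, struct list_node *entry)
{
	if (list_is_empty(entry))
		list_add_tail_node(entry, head);
}

/* Stand-in for finish_wait(): this is the step that gives up the spot. */
static void toy_finish_wait(struct list_node *entry)
{
	if (!list_is_empty(entry))
		list_del_init_node(entry);
}

/* Stand-in for wq_has_sleeper(): true if anyone is queued, the caller included. */
static bool toy_has_sleeper(const struct list_node *head)
{
	return !list_is_empty(head);
}

/* Hypothetical wq_has_other_sleepers(): true if someone besides 'entry' is queued. */
static bool toy_has_other_sleepers(const struct list_node *head, const struct list_node *entry)
{
	if (list_is_empty(head))
		return false;
	return !(head->next == entry && head->prev == entry);
}

static void toy_demo(void)
{
	struct list_node wq, a, b;

	list_init(&wq);
	list_init(&a);
	list_init(&b);

	/* A is the only waiter, yet the plain check already reports a sleeper. */
	toy_prepare_to_wait_exclusive(&wq, &a);
	assert(toy_has_sleeper(&wq));
	assert(!toy_has_other_sleepers(&wq, &a));

	/* B queues behind A; A "prepares" again on its next pass and keeps its spot. */
	toy_prepare_to_wait_exclusive(&wq, &b);
	toy_prepare_to_wait_exclusive(&wq, &a);
	assert(wq.next == &a && wq.prev == &b);
	assert(toy_has_other_sleepers(&wq, &a));

	/* Only after the finish_wait() stand-in does A go to the tail. */
	toy_finish_wait(&a);
	toy_prepare_to_wait_exclusive(&wq, &a);
	assert(wq.next == &b && wq.prev == &a);
}
```

Calling toy_demo() from a main() exercises all three cases: the sleeper check that is true for a lone waiter, the repeated prepare that leaves the entry in place, and the move to the tail once the entry has been removed and re-added.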