Re: [PATCH 1/2] wait: add wq_has_multiple_sleepers helper

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07/11, Josef Bacik wrote:
>
> On Thu, Jul 11, 2019 at 03:40:06PM +0200, Oleg Nesterov wrote:
> > rq_qos_wait() inside the main loop does
> >
> > 		if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) {
> > 			finish_wait(&rqw->wait, &data.wq);
> >
> > 			/*
> > 			 * We raced with wbt_wake_function() getting a token,
> > 			 * which means we now have two. Put our local token
> > 			 * and wake anyone else potentially waiting for one.
> > 			 */
> > 			if (data.got_token)
> > 				cleanup_cb(rqw, private_data);
> > 			break;
> > 		}
> >
> > finish_wait() + "if (data.got_token)" can race with rq_qos_wake_function()
> > which does
> >
> > 	data->got_token = true;
> > 	list_del_init(&curr->entry);
> >
>
> Argh finish_wait() does __set_current_state, well that's shitty.

Hmm. I think this is irrelevant,

> data->got_token = true;
> smp_wmb()
> list_del_init(&curr->entry);
>
> and then do
>
> smp_rmb();
> if (data.got_token)
> 	cleanup_cb(rqw, private_data);

Yes, this should work,

> > and I don't really understand
> >
> > 	has_sleeper = false;
> >
> > at the end of the main loop. I think it should do "has_sleeper = true",
> > we need to execute the code above only once, right after prepare_to_wait().
> > But this is harmless.
>
> We want has_sleeper = false because the second time around we just want to grab
> the inflight counter.

I don't think so.

> Yes we should have been worken up by our special thing
> and so should already have data.got_token,

Yes. Again, unless wakeup was spurious and this needs another trivial fix.

If we can't rely on this then this code is simply broken?

> but that sort of thinking ends in
> hung boxes and me having to try to mitigate thousands of boxes suddenly hitting
> a case we didn't think was possible.  Thanks,

I can't understand this logic, but I can't argue. However, in this case I'd
suggest the patch below instead of this series.

If rq_qos_wait() does the unnecessary acquire_inflight_cb() because it can
hit a case we didn't think was possible, then why can't it do on the first
iteration for the same reason? This should equally fix the problem and
simplify the code.

In case it is not clear: no, I don't like it. Just I can't understand your
logic.

And btw... again, I won't argue, but wq_has_multiple_sleepers is badly named,
and the comments are simply wrong. It can return T if wq has no sleepers, iow
if list_empty(wq_head->head). 2/2 actualy uses !wq_has_multiple_sleepers(),
this turns the condition back into list_is_singular(), but to me this alll
looks very confusing.

Plus I too do not understand smp_mb() in this helper.

Oleg.

--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -247,7 +247,7 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 	do {
 		if (data.got_token)
 			break;
-		if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) {
+		if (acquire_inflight_cb(rqw, private_data)) {
 			finish_wait(&rqw->wait, &data.wq);
 
 			/*
@@ -260,7 +260,6 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 			break;
 		}
 		io_schedule();
-		has_sleeper = false;
 	} while (1);
 	finish_wait(&rqw->wait, &data.wq);
 }




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux