On Tue, 27 Sep 2022, Jan Kara wrote:
> On Mon 26-09-22 20:39:03, Hugh Dickins wrote:
>
> So my thinking was that instead of having multiple counters, we'd have just
> two - one counting completions and the other one counting wakeups and if
> completions - wakeups > batch, we search for waiters in the wait queues,
> wake them up so that 'wakeups' counter catches up. That also kind of
> alleviates the 'wake_index' issue because racing updates to it will lead to
> reordering of wakeups but not to lost wakeups, retries, or anything.
>
> I also agree with your wake_up_nr_return() idea below, that is part of this
> solution (reliably waking given number of waiters) and in fact I have
> already coded that yesterday while thinking about the problem ;)

Great - I'm pleasantly surprised to have been not so far off, and we seem
to be much in accord. (What I called wake_up_nr_return() can perfectly
well be wake_up_nr() itself: I had just been temporarily avoiding a void
to int change in a header file, recompiling the world.)

Many thanks for your detailed elucidation of the batch safety,
in particular: I won't pretend to have absorbed it completely yet,
but it's there in your mail for me and all of us to refer back to.

> > TBH I have not tested this one outside of that experiment: would you
> > prefer this patch to my first one, I test and sign this off and send?
>
> Yes, actually this is an elegant solution. It has the same inherent
> raciness as your waitqueue_active() patch so wakeups could be lost even
> though some waiters need them but that seems pretty unlikely. So yes, if
> you can submit this, I guess this is a good band aid for the coming merge
> window.

No problem in the testing, the v2 patch follows now.

> >
> > > 2) Revert Yu Kuai's original fix 040b83fcecfb8 ("sbitmap: fix possible io
> > > hung due to lost wakeup") and my fixup 48c033314f37 ("sbitmap: Avoid leaving
> > > waitqueue in invalid state in __sbq_wake_up()"). But then Keith would have
> > > to redo his batched accounting patches on top.
> >
> > I know much too little to help make that choice.
>
> Yeah, I guess it is Jens' call in the end. I'm fine with both options.
>
> 								Honza
> --
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR
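
For anyone following along, the two-counter scheme quoted above (one
counter for completions, one for wakeups, wake when completions - wakeups
exceeds the batch) might be sketched in user-space C roughly as below.
This is only an illustrative model, not the sbitmap code: struct
wake_model, model_wake_up_nr() and model_complete() are hypothetical
names, the "sleepers" field merely stands in for tasks parked on the wait
queues, and waking at most one batch per trigger is an assumption rather
than something the mail specifies.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space model, not the kernel's struct sbitmap_queue. */
struct wake_model {
	atomic_uint completions;	/* incremented once per completion */
	atomic_uint wakeups;		/* total waiters woken so far */
	unsigned int batch;		/* wake threshold (assumed fixed here) */
	atomic_uint sleepers;		/* stand-in for tasks on the wait queues */
};

/*
 * Stand-in for a wake_up_nr() that reports how many waiters it actually
 * woke, which may be fewer than nr when not enough are sleeping.
 */
static unsigned int model_wake_up_nr(struct wake_model *m, unsigned int nr)
{
	unsigned int cur = atomic_load(&m->sleepers);
	unsigned int woken;

	do {
		woken = cur < nr ? cur : nr;
	} while (!atomic_compare_exchange_weak(&m->sleepers, &cur, cur - woken));

	return woken;
}

/* Called once per completion; returns the number of waiters woken. */
static unsigned int model_complete(struct wake_model *m)
{
	unsigned int done = atomic_fetch_add(&m->completions, 1) + 1;
	unsigned int woken_total = atomic_load(&m->wakeups);
	unsigned int woken;

	assert(m->batch > 0);

	/* Unsigned subtraction stays correct across counter wraparound. */
	if (done - woken_total <= m->batch)
		return 0;

	/* Assumption: wake at most one batch, then let 'wakeups' catch up. */
	woken = model_wake_up_nr(m, m->batch);
	atomic_fetch_add(&m->wakeups, woken);
	return woken;
}
```

In this model, two completers racing through model_complete() may both
pass the threshold check and wake in a different order than they
completed, but each wakeup is still counted exactly once. That matches
the "reordering but not lost wakeups" property claimed for the scheme.
It makes no attempt to model the waitqueue_active() raciness discussed
for the band-aid patch.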