On Tue, Feb 21, 2017 at 01:11:27PM +0000, Tvrtko Ursulin wrote: > > On 21/02/2017 12:43, Imre Deak wrote: > >On Tue, Feb 21, 2017 at 10:06:45AM +0000, Tvrtko Ursulin wrote: > >> > >>On 21/02/2017 09:37, Chris Wilson wrote: > >>>On Tue, Feb 21, 2017 at 11:22:12AM +0200, Imre Deak wrote: > >>>>On Mon, Feb 20, 2017 at 04:05:33PM +0000, Chris Wilson wrote: > >>>>>So that our preempt-off period doesn't grow completely unchecked, or do > >>>>>we need that 34ms loop? > >>>> > >>>>Yes, that's at least how I understand it. Scheduling away is what let's > >>>>PCODE start servicing some other request than ours or go idle. That's > >>>>in a way what we see when the preempt-enabled poll times out. > >>> > >>>I was thinking along the lines of if it was just busy/unavailable for the > >>>first 33ms that particular time, it just needed to sleep until ready. > >>>Once available, the next request ran in the expected 1ms. > >> > >>>Do you not see any value in trying a sleeping loop? Perhaps compromise > >>>and have the preempt-disable timeout increase each iteration. > > > >This fallback method would work too, but imo the worst case is what > >matters and that would be anyway the same in both cases. Because of this > >and since it's a WA I'd rather keep it simple. > > > >>Parachuting in so apologies if I misunderstood something. > >> > >>Is the issue here that we can get starved out of CPU time for more than 33ms > >>while waiting for an event? > > > >We need to actively resend the same request for this duration. > > > >>Could we play games with sched_setscheduler and maybe temporarily go > >>SCHED_DEADLINE or something? Would have to look into how to correctly > >>restore to the old state from that and from which contexts we can actually > >>end up in this wait. > > > >What would be the benefit wrt. disabling preemption? Note that since > >it's a workaround it would be good to keep it simple and close to how it > >worked on previous platforms (SKL/APL). > > It would be nicer not to relax that BUILD_BUG_ON in atomic wait for and, if > the main problem is the scheduler/CPU starvation, to see if it can be solved > differently. Even though the atomic wait here would trigger very rarely it > might be worth coming up with something nicer and generalized. > > If I understood it correctly, the difference between this wait_for call site > and the rest is that here it wants a certain number of COND checks to be > guaranteed? Yes. > The other call sites care more about checking on enter and exit. > > So in this case we want the period parameter to actually be guaranteed (or > close). This sounded like a good candidate for SCHED_DEADLINE to me. Like > wait_for_periodic(COND, TIMEOUT, INTERVAL). Could be. But this would give less guarantee than disabling preemption, as SCHED_DEADLINE still works on a best effort basis. How about increasing the timeout now (to 50ms) and trying what you suggest as a follow-up? That way we have also something for -stable. --Imre > Maybe that could get away with the second atomic loop and be a generic > solution on all platforms. > > Regards, > > Tvrtko