On Thu, May 03, 2018 at 09:58:56AM -0700, Rodrigo Vivi wrote:
> On Wed, May 02, 2018 at 03:31:15PM -0700, Tarun Vyas wrote:
> > On Wed, May 02, 2018 at 01:04:06PM -0700, Vivi, Rodrigo wrote:
> > > On Wed, May 02, 2018 at 09:51:43PM +0300, Ville Syrjälä wrote:
> > > > On Wed, May 02, 2018 at 11:19:14AM -0700, Tarun Vyas wrote:
> > > > > On Mon, Apr 30, 2018 at 10:19:33AM -0700, Rodrigo Vivi wrote:
> > > > > > On Sun, Apr 29, 2018 at 09:00:18PM -0700, Tarun Vyas wrote:
> > > > > > > From: Tarun <tarun.vyas@xxxxxxxxx>
> > > > > > >
> > > > > > > The PIPEDSL freezes on PSR entry and if PSR hasn't fully
> > > > > > > exited, then the pipe_update_start call schedules itself
> > > > > > > out to check back later.
> > > > > > >
> > > > > > > On the ChromeOS-4.4 kernel, which is fairly up-to-date
> > > > > > > w.r.t drm/i915 but lags w.r.t core kernel code, hot
> > > > > > > plugging an external display triggers tons of "potential
> > > > > > > atomic update errors" in the dmesg, on *pipe A*. A closer
> > > > > > > analysis reveals that we try to read the scanline 3 times
> > > > > > > and eventually time out, b/c PSR hasn't exited fully,
> > > > > > > leading to a PIPEDSL stuck @ 1599. This issue is not seen
> > > > > > > on upstream kernels, b/c for *some* reason we loop inside
> > > > > > > intel_pipe_update_start for ~2+ msec, which in this case
> > > > > > > is more than enough to exit PSR fully, hence an *unstuck*
> > > > > > > PIPEDSL counter, hence no error. On the other hand, the
> > > > > > > ChromeOS kernel spends ~1.1 msec looping inside
> > > > > > > intel_pipe_update_start and hence errors out b/c the
> > > > > > > source is still in PSR.
> > > > > > >
> > > > > > > Regardless, we should wait for PSR exit (if PSR is
> > > > > > > supported and active on the current pipe) before reading
> > > > > > > the PIPEDSL, b/c if we haven't fully exited PSR, then
> > > > > > > checking for vblank evasion isn't actually applicable.
> > > > > > >
> > > > > > > This scenario applies to a configuration with an
> > > > > > > additional pipe, as of now.
> > > > > >
> > > > > > I honestly believe you're picking the wrong culprit here.
> > > > > > By "coincidence", PSR will allow a DC state with the screen
> > > > > > on, and DC states will mess up all register reads....
> > > > > >
> > > > > > Probably what you are missing in your kernel is some power
> > > > > > domain grab that would keep DC_OFF and consequently a sane
> > > > > > read of these registers.
> > > > > >
> > > > > > Maybe Imre has a quick idea of what you could be missing on
> > > > > > your kernel that we already have on the upstream one.
> > > > > >
> > > > > > Thanks,
> > > > > > Rodrigo.
> > > > > >
> > > > > Thanks for the quick response Rodrigo !
> > > > > Some key observations based on my experiments so far:
> > > > >
> > > > > for (;;) {
> > > > > 	/*
> > > > > 	 * prepare_to_wait() has a memory barrier, which guarantees
> > > > > 	 * other CPUs can see the task state update by the time we
> > > > > 	 * read the scanline.
> > > > > 	 */
> > > > > 	prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
> > > > >
> > > > > 	scanline = intel_get_crtc_scanline(crtc);
> > > > > 	if (scanline < min || scanline > max)
> > > > > 		break;
> > > > >
> > > > > 	if (timeout <= 0) {
> > > > > 		DRM_ERROR("Potential atomic update failure on pipe %c\n",
> > > > > 			  pipe_name(crtc->pipe));
> > > > > 		break;
> > > > > 	}
> > > > >
> > > > > 	local_irq_enable();
> > > > >
> > > > > 	timeout = schedule_timeout(timeout);
> > > > >
> > > > > 	local_irq_disable();
> > > > > }
> > > > >
> > > > > 1. In the above loop inside pipe_update_start, the *first
> > > > > time* we read the PIPEDSL, with PSR1 and an external display
> > > > > connected, it always reads 1599, for *both* kernels (upstream
> > > > > and ChromeOS-4.4). The PSR_STATUS also reads exactly the same
> > > > > for *both* kernels and shows that we haven't *fully* exited
> > > > > PSR.
> > > > >
> > > > > 2. The difference between the two kernels comes after this
> > > > > first read of the PIPEDSL.
> > > > > ChromeOS-4.4 spends ~1 msec inside that loop and upstream
> > > > > spends ~2 msec. I suspect that this is because of scheduling
> > > > > changes between the two kernels, b/c I can't find any i915
> > > > > specific code running in that loop, except for vblank
> > > > > processing.
> > > > >
> > > > > 3. So to summarize, both kernels are in the same state w.r.t
> > > > > PSR and the PIPEDSL value when they read the PIPEDSL for the
> > > > > first time inside the loop. *When* the kernels *transition*
> > > > > to a *full PSR exit* is what differs.
> > >
> > > Oh! So you really are getting reliable counters....
> > >
> > > > >
> > > > > My rationale for this patch is that the pipe_update_start
> > > > > function is meant to evade 100 usec before a vblank, but,
> > > > > *if* we haven't *fully* exited PSR (which is true for both
> > > > > kernels on the first PIPEDSL read), then vblank evasion is
> > > > > *not applicable* b/c the PIPEDSL will be messed up. So we
> > > > > shouldn't bother evading vblank until we have fully exited
> > > > > PSR.
> > > >
> > > > Yeah, I think this is the right direction. The problem really
> > > > is the extra vblank pulse that the hardware generates (or at
> > > > least can generate, depending on a chicken bit) when it exits
> > > > PSR. We have no control over when that happens and hence we
> > > > have no control over when the registers get latched. And yet
> > > > we still have to somehow prevent the register latching from
> > > > occurring while we're in the middle of reprogramming them.
> > >
> > > I see the problem now. Thanks for the explanation.
> > >
> > > >
> > > > There are a couple of ways to avoid this:
> > > > 1) Set the chicken bit so that we don't get the vblank pulse.
> > > >    The pipe should restart from the vblank start, so we would
> > > >    have one full frame to reprogram the registers. However,
> > > >    IIRC DK told me there is no way to fully eliminate it in
> > > >    all cases, so this option is probably out. There was also
> > > >    some implication for FBC which I already forgot.
> > > > 2) Make sure we've exited PSR before reprogramming the
> > > >    registers (ie. what you do).
> > > > 3) Use the DOUBLE_BUFFER_CTL to prevent the extra vblank pulse
> > > >    from latching the registers while we're still reprogramming
> > > >    them. This feature only exists on SKL+ so it is not a
> > > >    solution for HSW/BDW. But maybe HSW/BDW didn't even have
> > > >    the extra vblank pulse?
> > > >
> > > > Option 2) does provide a consistent behaviour on all
> > > > platforms, so I do kinda like it. It also avoids a bigger
> > > > rework on account of the DOUBLE_BUFFER_CTL. I do think we'll
> > > > have to start using DOUBLE_BUFFER_CTL anyway due to other
> > > > issues, but at least this way we don't block PSR progress on
> > > > that work.
> > >
> > > My vote is for option 2. Seems more straightforward and more
> > > broad.
> > >
> > > DK?
> > >
> > > My only request on the patch itself would be to create a
> > > function in intel_psr.c, intel_psr_wait_for_idle... or
> > > something like this, and put the register wait logic inside it
> > > instead of spreading the PSR code around.
> > >
> > > Thanks,
> > > Rodrigo.
> > >
> > > >
> > > > --
> > > > Ville Syrjälä
> > > > Intel
> >
> > Thanks for the comments, Ville and Rodrigo. I'll rework this to
> > move the wait to intel_psr.c. There is a psr_wait_for_idle() in
> > there, but there are some PSR locks being passed around inside it
> > (eventually released by the caller). Also, the max timeout
> > specified there is 50 msec, which might be way too much?
>
> ouch! that function is ugly.... unlock then lock back again...
> (especially unlock without any assert that it's locked... :/)
>
> If you can improve that or split it in a way that we reuse some
> code, it would be nice...
>
Yes, I was also considering splitting it up. I'll rework and send it
out. Thanks !
-Tarun

> > Best,
> > Tarun

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
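[Editor's note: the intel_psr_wait_for_idle() that Rodrigo asks for is, at its core, a bounded poll on a PSR status register that must report idle before the vblank-evasion loop is allowed to trust the PIPEDSL. The sketch below is a self-contained userspace approximation of that poll pattern only; the names (psr_status_idle, fake_psr_busy_reads) and the poll bound are illustrative stand-ins, not the real i915 API or register interface.]

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simulated PSR status: reports "busy" for a fixed number of reads
 * before going idle, standing in for the hardware PSR status
 * register that stays non-idle until the PSR exit completes.
 */
static int fake_psr_busy_reads;

static bool psr_status_idle(void)
{
	if (fake_psr_busy_reads > 0) {
		fake_psr_busy_reads--;
		return false;
	}
	return true;
}

/*
 * Bounded poll: re-read the status until it reports idle or we run
 * out of attempts.  Returns 0 on success, -1 on timeout, mirroring
 * in spirit the kernel's wait_for(COND, timeout) idiom.  A real
 * implementation would sleep (e.g. usleep_range()) between reads
 * instead of spinning, and would pick the timeout so it stays well
 * under a frame period.
 */
static int psr_wait_for_idle(int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (psr_status_idle())
			return 0;
		/* real code: back off / sleep here */
	}
	return -1; /* -ETIMEDOUT in kernel terms */
}
```

The point of the split discussed above is that this wait takes no PSR locks and releases nothing on behalf of its caller, so it can be called from pipe_update_start before the scanline is read.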