Re: [RFC PATCH] drm: Fix off-by-one races on vblank disable

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jesse, cc'ing you in case you have thoughts about this for the intel side of things?

On Nov 17, 2011, at 4:39 PM, Andrew Lutomirski wrote:

2011/11/17 Michel Dänzer <michel@xxxxxxxxxxx>:
[ Dropping intel-gfx list from CC, as it automatically rejects posts
from non-subscribers ]

On Mit, 2011-11-16 at 21:39 -0800, Andy Lutomirski wrote:
There are two possible races when disabling vblanks.  If the IRQ
fired but the hardware didn't update its counter yet, then we store
too low a hardware counter.  (Sensible hardware never does this.
Apparently not all hardware is sensible.)

The thing is, 'the' IRQ and 'the' hardware counter aren't necessarily
about the same thing. We want an IRQ which triggers at the beginning of
vertical blank, but the Radeon hardware counters increment at the
beginning of scanout, i.e. at the end of vertical blank. Does that make
the hardware 'broken' or 'not sensible'?

Perhaps not.  I think that using that counter as the return value of
get_vblank_counter is a bit unfortunate in that case, though.  Anyway,
I'm not particularly attached to the comment as written.

Am I correct in guessing that Radeon fires its vblank irq at the
beginning of vblank instead of at the end  And doesn't this mean that
Radeon has a separate issue when vblank_get happens during the vblank
interval?  A trick similar to drm_calc_vbltimestamp_from_scanoutpos
ought to be able to fix that.


The Radeon's i tested (R500, R600) fire the vblank irq even a few scanlines before the start of vblank. I think most gpu's fire very close to the beginning of vblank, because one wants to use the vblank irq to trigger some work that needs to be done within the vblank interval. Some Radeon's sometimes fire a spurious vblank irq at the moment vblank irq's get enabled, althought that could also be some tiny bug in the kms driver, maybe missing some acknowledge of a pending irq somewhere in the vblank off or on function. That's the reason for some of the fun inside drm_calc_vbltimestamp_from_scanoutpos() and drm_handle_vblank(), to associate a vblank irq and the vblank timestamps with the correct vblank interval, even if the irq handler executes a bit outside the vblank interval.


If, on the other hand, the counter updated but the IRQ didn't fire
yet, we store too high a counter.

We handled the former case with a heuristic based on timestamps and
we did not handle the latter case.  By saving a little more state,
we can handle both cases exactly: all we need to do is watch for
changes in the difference between the hardware and software vblank
counts.

I'm afraid that can't work:

Some (AFAIR also Intel) hardware resets the counter to 0 when the CRTC is disabled / enabled (e.g. for DPMS, or a modeset). That's why we ended up only counting interrupts while the IRQ is enabled, and only using the
hardware counter to fill in while the IRQ is disabled. The hardware
counter cannot be assumed to have any defined behaviour between enabling
and disabling the IRQ.

To compensate for this, the drivers call drm_vblank_pre_modeset (which
enables the IRQ, which also updates the virtual counter from the
hardware counter) before disabling the CRTC and drm_vblank_post_modeset (which disables the IRQ, which also records the hardware counter) after
enabling the CRTC.


Shouldn't this be fixable, though, by adding appropriate logic in
drm_vblank_pre_modeset and drm_vblank_post_modeset?


Would this be a lot simpler if drm had a function (or call to the
driver) to compute both the vblank count and the last vblank
timestamp?  That way the irq handler, drm_update_vblank_count,
vblank_disable_and_save, and pre/post modeset could keep everything
exactly in sync.  IIRC i915 has this capability in hardware, and if
Radeon indeed updates the frame count exactly at the end of the vblank
interval, it could compute this exactly as well.

I think Andrews idea is very close to what i proposed in Matthews RFC e-mail thread, if we use the scanout position as a "clock", except we could get rid of the calibration loop, the part i like least about my proposal, if we know for sure when each gpu increments its hardware counter.

At least intel, radeon and nouveau (hopefully soonish - patches almost ready for submission) expose the dev->driver- >get_scanout_position() hook, so we know the scanline at time of vblank counter query. If we knew at which scanline each gpu increments its counter we could simply compare against scanline at counter query time and decide if we need to add 1 to the value or not. We could make the hardware counter value look like as if it is from a hardware counter that always increments at leading edge of vblank, which i think would be consistent with what the irq driven vblank increment does.

Pseudocode (with bonus goto included!):

// To be called from vblank irq on/off routines instead of driver- >get_vblank_count(crtc):
uint32 get_effective_hw_vblank_count(crtc)
{

retry:

    uint32 scanline_pre = driver->get_scanout_position(crtc);
    uint32 hw_count = driver->get_vblank_count(crtc);
    uint32 scanline_post = driver->get_scanout_position(crtc);

if ((scanline_pre outside vblank) && (scanline_post outside vblank) {
        // Inside active scanout -> All counters stable -> no problem:
        return hw_count;
    }

    // Inside vblank -> careful!
if (scanline_where_gpu_increments_its_counter intersects interval [scanline_pre ; scanline_post]) { // Exact time of get_vblank_count() query "uncertain" wrt. hw counter increment.
	cpu_relax(); // or something similar...
        goto retry;
    }

    // Inside vblank. Query happened before or after hw increment?
    if (scanline_post < scanline_where_gpu_increments_its_counter) {
        // Query happened before hw counter increment for sure. Fake
        // the increment has already happened:
        hw_count++;
    }

    return hw_count;
}

Something like this? The worst case busy waiting time would be maybe 2 scanline durations ( < 50 microseconds) in the very rare case where we hit exactly the scanline where hw counter increments. That amount of busy waiting sounds acceptable to me, given that we'd still keep a vblank off delay of a few milliseconds to avoid many redundant on/off transitions during a typical OpenGL bufferswap and that this never gets directly called from interrupt context.

We could put this into the drm, then the drivers would need to tell the drm the scanline where increment happens, or we could replace the driver->get_vblank_count() hook in each driver with some variant of this? Drivers with such a thing could lower the vblank off delay to a few msecs, other drivers would leave it at a high value.

Thoughts?
-mario

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel



[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux