Re: Funky new vblank counter regressions in Linux 4.4-rc1


 



On 11/25/2015 08:38 PM, Alex Deucher wrote:
On Wed, Nov 25, 2015 at 1:21 PM, Mario Kleiner
<mario.kleiner.de@xxxxxxxxx> wrote:
On 11/25/2015 06:58 PM, Ville Syrjälä wrote:

On Wed, Nov 25, 2015 at 06:24:13PM +0100, Mario Kleiner wrote:

On 11/23/2015 09:24 PM, Ville Syrjälä wrote:

On Mon, Nov 23, 2015 at 06:58:34PM +0100, Mario Kleiner wrote:



On 11/23/2015 04:51 PM, Ville Syrjälä wrote:

On Mon, Nov 23, 2015 at 04:23:21PM +0100, Mario Kleiner wrote:

On 11/20/2015 04:34 PM, Ville Syrjälä wrote:

On Fri, Nov 20, 2015 at 04:24:50PM +0100, Mario Kleiner wrote:


...
Ok, but why would that be a bad thing? I think we want it to think it is
in the previous frame if it is called outside the vblank irq context.
The only reason we fudge it to the next frame's vblank if in vblank irq is
because we know the vblank irq handler we are executing atm. was meant
to execute within the upcoming vblank for the next frame, so we fudge
the scanout positions and thereby timestamp to correspond to that new
frame. But if something calls it outside irq context it should get a
scanout position/timestamp that corresponds to "reality".


It would be a bad thing since it would cause the timestamp to jump
backwards, and that would also cause the frame count guesstimate to go
backwards.


But only if we don't use the dev->driver->get_vblank_counter() method,
which we try to use on AMD.


Well, if you do it that way then you have the problem of the hw counter
seeming to jump forward by one after crossing the start of vblank (when
compared to the value you sampled when you processed the early vblank
interrupt).


Ok, finally i see the bad scenario that wouldn't get prevented by our
current locking with the new vblank counting in the core. The vblank
enable path is safe due to locking and discounting of redundant
timestamps etc. But the disable path could go wrong:

1. Vblank irq fires, drm_handle_vblank() -> drm_update_vblank_count(),
updates timestamps and counts "as if" in vblank -> incremented vblank
count and timestamp now set in the future.

2. After vblank irq finishes, but just before leading edge of vblank,
vblank_disable_and_save() executes, doesn't get bumped timestamp or
count because before vblank and not in vblank irq. Now
drm_update_vblank_count() would process a "new" timestamp and count
from the past and we'd have time and counts going backwards, and bad
things would happen.

I haven't observed such a thing happening during testing so far,
probably because the time window in which it could happen is tiny, but
given how awfully bad it would be, it needs to be prevented.
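
[Editorial sketch] The two-step disable race above can be modelled in a few lines of C. This is only an illustration under assumptions: frame_irq_only() and frame_uniform() are hypothetical names, not DRM core functions, and the model reduces "which frame does this sample belong to" to a single integer. The first function mimics the behaviour described above (fudging only inside the vblank irq); the second shows one possible alternative in which the early window counts as vblank for every caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. scanned_frame is the frame whose
 * active area is being scanned out; a sample taken in vblank is attributed
 * to scanned_frame + 1. */

/* Fudge only in irq context: a position shortly before vblank is counted
 * as the next frame only when sampled from the vblank irq handler. */
static unsigned int frame_irq_only(unsigned int scanned_frame, int vpos,
                                   int vbl_start, int fudge_lines,
                                   bool in_vblank_irq)
{
    assert(fudge_lines >= 0);
    bool in_vblank = vpos >= vbl_start;
    bool just_before = vpos >= vbl_start - fudge_lines && vpos < vbl_start;

    return scanned_frame +
           ((in_vblank || (just_before && in_vblank_irq)) ? 1u : 0u);
}

/* Uniform fudge: the window before the true vblank counts as vblank for
 * every caller, so later samples can never fall back to the old frame. */
static unsigned int frame_uniform(unsigned int scanned_frame, int vpos,
                                  int vbl_start, int fudge_lines)
{
    assert(fudge_lines >= 0);
    return scanned_frame + (vpos >= vbl_start - fudge_lines ? 1u : 0u);
}
```

With vbl_start = 1080 and a 10-line window, a sample at line 1077 from the irq handler followed by a sample at line 1078 from the disable path gives frames N+1 then N under the first model, which is the backwards jump from step 2. Under the second model both samples give N+1.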

I had a look at the description of the Vblank irq in the "M76 Register
Reference Guide" for older asics and the description suggests that the
vblank irq fires when the crtc's line buffer is finished reading pixel
data from the scanout buffer in memory for a frame, ie., when the line
buffer read "enters" vblank.


Hmm. Does that mean there's always at least one fullscreen plane enabled
in the hw? As in you can't turn off the primary plane or make it smaller
than the active video area? Otherwise it sounds like you could either
not get it at all, or get it somewhere in the middle of the screen.


It says "Interrupt that can be programmed to be generated by the
primary display controller's line buffer logic either when the
source image line counter is not requesting any active
display data (i.e. in the vertical blank) or the output CRTC
timing generator is within the vertical blanking region."

So my statements were my interpretation of this quote, so i can make some
sense out of the vblank irq behaviour. I guess Alex or Harry would know? The
M76 reference refers to some older asics, i just assume it is the same for
the current ones, given that observed behaviour would be consistent with the
line buffer causing this lead of a couple of scanlines. I see about 2
scanlines on DCE4 and about 3 scanlines on DCE3. I don't know how big the
line buffer is, how quickly it refills etc., but it sounds reasonable.

The size of the line buffer varies by generation, but the LB logic is
still responsible for generating the vblank interrupt even on newer
hw.

Alex


Thanks for the pointer. I dug through the evergreen_line_buffer_adjust() function and its siblings from classic to DCE11. It seems the line buffer capacity goes up to a max of 16384 pixels on a single display, e.g., evergreen, and more like 8192 pixels on DCE8+? It's expressed as 8192 * 2, and some places talk about different line buffer partitions. At least for DCE4+ i can see the sizes from the code, but not for earlier gens.

To find the proper value of how much earlier i need to fudge/place the start of vblank wrt. the true start of vblank in the code, i think i'll have to adjust for line buffer size.

The worst case would be that the line buffer can be ahead of our scanout positions by the full height of the line buffer a la: fudgelines = lb_size / mode->crtc_hdisplay

I think i could calculate this in the xxx_line_buffer_adjust() functions and store it in radeon_crtc for use in the fudging code. Atm. i'm just using a hard-coded "10", which so far has worked for typical display modes. Assuming something like a 640x480 mode with a 16384 pixel line buffer, i'd need to set a constant of 26 (16384 / 640 = 25.6, rounded up) if i wanted to account for the full line buffer size as worst case. So i wonder what a good value would be to be safe but at the same time keep this extended vblank small? E.g., maybe a smaller value than the full line buffer height would do, because due to the way watermarks for lb refill are set it can't get ahead by more than a few scanlines << total size?
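
[Editorial sketch] The worst-case estimate fudgelines = lb_size / mode->crtc_hdisplay can be written as a rounded-up division. lb_fudge_lines() is a hypothetical helper, not an existing radeon function, and the parameter names are assumptions; only the formula and the buffer sizes come from the discussion above.

```c
#include <assert.h>

/* Hypothetical helper: worst-case number of scanlines the line buffer
 * could be ahead of the scanout position, assuming the whole buffer is
 * filled with lines of hdisplay pixels each. Rounds up so that a
 * partially filled last line still counts. */
static unsigned int lb_fudge_lines(unsigned int lb_size_pixels,
                                   unsigned int hdisplay)
{
    assert(hdisplay > 0);
    return (lb_size_pixels + hdisplay - 1) / hdisplay;
}
```

For a 16384 pixel buffer this gives 26 lines at 640 pixels width and 9 lines at 1920 pixels width. Whether the full buffer depth is the right bound, or a smaller watermark-based one, is exactly the open question in this mail.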

I also realized that we need to extend the virtual vblank by the same fudged number of scanlines when programming the page flip. Our hw settings make it flip at the leading edge of true vblank if the new bo address is latched in time. To keep all the logic in userspace working properly we need to guarantee that it will now flip at the start of the fudged vblank at the latest, and otherwise delay the flip by one frame. So i'd add a check for being in the fudged "vblank before the true vblank" to radeon_flip_work_func() before programming the flip, and udelay() if we'd end up in this forbidden zone. This problem already exists in the old code, but probably doesn't happen frequently (or ever) in practice, at least i couldn't easily provoke it, because usually we are too slow in scheduling the flip to fit into the time window for this race. But future optimizations or faster machines might break it.
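
[Editorial sketch] The proposed check before programming the flip could look roughly like the following. All three functions are hypothetical and are not part of radeon_flip_work_func(); the sketch only covers the arithmetic of the "forbidden zone" between the fudged and the true start of vblank, with the scanline duration passed in as an assumed parameter.

```c
#include <assert.h>
#include <stdbool.h>

/* True if vpos lies in the zone between the fudged (early) start of
 * vblank and the true start of vblank. */
static bool in_fudged_prevblank(int vpos, int vbl_start, int fudge_lines)
{
    assert(fudge_lines >= 0);
    return vpos >= vbl_start - fudge_lines && vpos < vbl_start;
}

/* Scanlines to wait until the true vblank has started, so that a flip
 * programmed afterwards completes in the following vblank. Zero when
 * the position is outside the forbidden zone. */
static int flip_wait_lines(int vpos, int vbl_start, int fudge_lines)
{
    return in_fudged_prevblank(vpos, vbl_start, fudge_lines)
               ? vbl_start - vpos : 0;
}

/* Convert a number of scanlines into microseconds, rounded up, given
 * the duration of one scanline in nanoseconds. */
static unsigned int flip_wait_usecs(int lines, unsigned int line_ns)
{
    assert(lines >= 0);
    return ((unsigned int)lines * line_ns + 999u) / 1000u;
}
```

A real flip worker would more likely re-read the scanout position in a loop with short udelay() steps than trust a single computed delay, since the position sample itself can be stale by the time the delay starts.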

Thoughts?
-mario



