Re: [PATCH v2 3/3] drm/vc4: Add a load tracker to prevent HVS underflow errors

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Boris Brezillon <boris.brezillon@xxxxxxxxxxx> writes:

> Hi Eric,
>
> On Tue, 30 Oct 2018 16:12:55 -0700
> Eric Anholt <eric@xxxxxxxxxx> wrote:
>> > +static void vc4_load_tracker_destroy_state(struct drm_private_obj *obj,
>> > +					   struct drm_private_state *state)
>> > +{
>> > +	struct vc4_load_tracker_state *load_state;
>> > +
>> > +	load_state = to_vc4_load_tracker_state(state);
>> > +	kfree(load_state);
>> > +}  
>> 
>> Optional: just kfree(state) for simplicity.
>
> Hm, not sure that's a good idea. kfree(state) works as long as
> drm_private_state is the first field in vc4_load_tracker_state, but it
> sounds a bit fragile.
>
> I can do
>
> 	kfree(to_vc4_load_tracker_state(state));
>
> if you prefer.

I said optional for that reason :)  Just keep it as is.

>> > +static void vc4_plane_calc_load(struct drm_plane_state *state)
>> > +{
>> > +	unsigned int hvs_load_shift, vrefresh, i;
>> > +	struct drm_framebuffer *fb = state->fb;
>> > +	struct vc4_plane_state *vc4_state;
>> > +	struct drm_crtc_state *crtc_state;
>> > +	unsigned int vscale_factor;
>> > +
>> > +	vc4_state = to_vc4_plane_state(state);
>> > +	crtc_state = drm_atomic_get_existing_crtc_state(state->state,
>> > +							state->crtc);
>> > +	vrefresh = drm_mode_vrefresh(&crtc_state->adjusted_mode);
>> > +
>> > +	/* The HVS is able to process 2 pixels/cycle when scaling the source,
>> > +	 * 4 pixels/cycle otherwise.
>> > +	 * Alpha blending step seems to be pipelined and it's always operating
>> > +	 * at 4 pixels/cycle, so the limiting aspect here seems to be the
>> > +	 * scaler block.
>> > +	 * HVS load is expressed in clk-cycles/sec (AKA Hz).
>> > +	 */
>> > +	if (vc4_state->x_scaling[0] != VC4_SCALING_NONE ||
>> > +	    vc4_state->x_scaling[1] != VC4_SCALING_NONE ||
>> > +	    vc4_state->y_scaling[0] != VC4_SCALING_NONE ||
>> > +	    vc4_state->y_scaling[1] != VC4_SCALING_NONE)
>> > +		hvs_load_shift = 1;
>> > +	else
>> > +		hvs_load_shift = 2;
>> > +
>> > +	vc4_state->membus_load = 0;
>> > +	vc4_state->hvs_load = 0;
>> > +	for (i = 0; i < fb->format->num_planes; i++) {
>> > +		unsigned long pixels_load;  
>> 
>> I'm scared any time I see longs.  Do you want 32 or 64 bits here?
>
> I just assumed a 32bit unsigned var would be enough, so unsigned long
> seemed just fine. I can use u32 or u64 if you prefer.

Yes, please.  See also Maxime's recent trouble with a 64-bit kernel.

>> > +		/* Even if the bandwidth/plane required for a single frame is
>> > +		 *
>> > +		 * vc4_state->src_w[i] * vc4_state->src_h[i] * cpp * vrefresh
>> > +		 *
>> > +		 * when downscaling, we have to read more pixels per line in
>> > +		 * the time frame reserved for a single line, so the bandwidth
>> > +		 * demand can be punctually higher. To account for that, we
>> > +		 * calculate the down-scaling factor and multiply the plane
>> > +		 * load by this number. We're likely over-estimating the read
>> > +		 * demand, but that's better than under-estimating it.
>> > +		 */
>> > +		vscale_factor = DIV_ROUND_UP(vc4_state->src_h[i],
>> > +					     vc4_state->crtc_h);
>> > +		pixels_load = vc4_state->src_w[i] * vc4_state->src_h[i] *
>> > +			      vscale_factor;  
>> 
>> If we're upscaling (common for video, right?), aren't we under-counting
>> the cost?  You need to scale/colorspace-convert crtc_w * crtc_h at 2
>> pixels per cycle.
>
> That's not entirely clear to me. I'm not sure what the scaler does when
> upscaling. Are the same pixels read several times from the memory? If
> that's the case, then the membus load should indeed be based on the
> crtc_w,h.

I'm going to punt on this question because that would be a *lot* of
verilog tracing to figure out for me (and I'm not sure I'd even trust
what I came up with).

> Also, when the spec says the HVS can process 4pixels/cycles, is it 4
> input pixels or 4 output pixels per cycle?

Well, it's 4 pixels per cycle when not scaling, so both :)

I think the scaling pipeline is doing two output pixels per cycle.
Nothing else would make sense to me.

Attachment: signature.asc
Description: PGP signature

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux