On Thu, 08 Nov 2018 08:26:49 -0800 Eric Anholt <eric@xxxxxxxxxx> wrote: > >> > +static void vc4_plane_calc_load(struct drm_plane_state *state) > >> > +{ > >> > + unsigned int hvs_load_shift, vrefresh, i; > >> > + struct drm_framebuffer *fb = state->fb; > >> > + struct vc4_plane_state *vc4_state; > >> > + struct drm_crtc_state *crtc_state; > >> > + unsigned int vscale_factor; > >> > + > >> > + vc4_state = to_vc4_plane_state(state); > >> > + crtc_state = drm_atomic_get_existing_crtc_state(state->state, > >> > + state->crtc); > >> > + vrefresh = drm_mode_vrefresh(&crtc_state->adjusted_mode); > >> > + > >> > + /* The HVS is able to process 2 pixels/cycle when scaling the source, > >> > + * 4 pixels/cycle otherwise. > >> > + * Alpha blending step seems to be pipelined and it's always operating > >> > + * at 4 pixels/cycle, so the limiting aspect here seems to be the > >> > + * scaler block. > >> > + * HVS load is expressed in clk-cycles/sec (AKA Hz). > >> > + */ > >> > + if (vc4_state->x_scaling[0] != VC4_SCALING_NONE || > >> > + vc4_state->x_scaling[1] != VC4_SCALING_NONE || > >> > + vc4_state->y_scaling[0] != VC4_SCALING_NONE || > >> > + vc4_state->y_scaling[1] != VC4_SCALING_NONE) > >> > + hvs_load_shift = 1; > >> > + else > >> > + hvs_load_shift = 2; > >> > + > >> > + vc4_state->membus_load = 0; > >> > + vc4_state->hvs_load = 0; > >> > + for (i = 0; i < fb->format->num_planes; i++) { > >> > + unsigned long pixels_load; > >> > >> I'm scared any time I see longs. Do you want 32 or 64 bits here? > > > > I just assumed a 32bit unsigned var would be enough, so unsigned long > > seemed just fine. I can use u32 or u64 if you prefer. > > Yes, please. See also Maxime's recent trouble with a 64-bit kernel. Will use u32 then. > > >> > + /* Even if the bandwidth/plane required for a single frame is > >> > + * > >> > + * vc4_state->src_w[i] * vc4_state->src_h[i] * cpp * vrefresh > >> > + * > >> > + * when downscaling, we have to read more pixels per line in > >> > + * the time frame reserved for a single line, so the bandwidth > >> > + * demand can be punctually higher. To account for that, we > >> > + * calculate the down-scaling factor and multiply the plane > >> > + * load by this number. We're likely over-estimating the read > >> > + * demand, but that's better than under-estimating it. > >> > + */ > >> > + vscale_factor = DIV_ROUND_UP(vc4_state->src_h[i], > >> > + vc4_state->crtc_h); > >> > + pixels_load = vc4_state->src_w[i] * vc4_state->src_h[i] * > >> > + vscale_factor; > >> > >> If we're upscaling (common for video, right?), aren't we under-counting > >> the cost? You need to scale/colorspace-convert crtc_w * crtc_h at 2 > >> pixels per cycle. > > > > That's not entirely clear to me. I'm not sure what the scaler does when > > upscaling. Are the same pixels read several times from the memory? If > > that's the case, then the membus load should indeed be based on the > > crtc_w,h. > > I'm going to punt on this question because that would be a *lot* of > verilog tracing to figure out for me (and I'm not sure I'd even trust > what I came up with). > > > Also, when the spec says the HVS can process 4pixels/cycles, is it 4 > > input pixels or 4 output pixels per cycle? > > Well, it's 4 pixels per cycle when not scaling, so both :) Sorry, I meant 2pixels/cycle :). > > I think the scaling pipeline is doing two output pixels per cycle. > Nothing else would make sense to me. Okay, so the HVS load should be based on crtc_w/h and the membus load should be based on src_w/h and the scaling ratio, right? _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel