[RFC] Using DC in amdgpu for upcoming GPU

daniel@xxxxxxxx (Daniel Vetter) · Sun, 11 Dec 2016 21:28:28 +0100

On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> We propose to use the Display Core (DC) driver for display support on
> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> avoid a flag day the plan is to only support uGPU initially and transition
> to older ASICs gradually.

Bridgeman brought it up a few times that this here was the question - it's
kinda missing a question mark, hard to figure this out ;-). I'd say for
upstream it doesn't really matter, but imo having both atomic and
non-atomic paths in one driver is one world of hurt and I strongly
recommend against it, at least if feasible. All drivers that switched
switched in one go, the only exception was i915 (it took much longer than
we ever feared, causing lots of pain) and nouveau (which only converted
nv50+, but pre/post-nv50 have always been two almost completely separate
worlds anyway).

> The DC component has received extensive testing within AMD for DCE8, 10, and
> 11 GPUs and is being prepared for uGPU. Support should be better than
> amdgpu's current display support.
> 
>  * All of our QA effort is focused on DC
>  * All of our CQE effort is focused on DC
>  * All of our OEM preloads and custom engagements use DC
>  * DC behavior mirrors what we do for other OSes
> 
> The new asic utilizes a completely re-designed atom interface, so we cannot
> easily leverage much of the existing atom-based code.
> 
> We've introduced DC to the community earlier in 2016 and received a fair
> amount of feedback. Some of what we've addressed so far are:
> 
>  * Self-contain ASIC specific code. We did a bunch of work to pull
>    common sequences into dc/dce and leave ASIC specific code in
>    separate folders.
>  * Started to expose AUX and I2C through generic kernel/drm
>    functionality and are mostly using that. Some of that code is still
>    needlessly convoluted. This cleanup is in progress.
>  * Integrated Dave and Jeromeâ??s work on removing abstraction in bios
>    parser.
>  * Retire adapter service and asic capability
>  * Remove some abstraction in GPIO
> 
> Since a lot of our code is shared with pre- and post-silicon validation
> suites changes need to be done gradually to prevent breakages due to a major
> flag day.  This, coupled with adding support for new asics and lots of new
> feature introductions means progress has not been as quick as we would have
> liked. We have made a lot of progress none the less.
> 
> The remaining concerns that were brought up during the last review that we
> are working on addressing:
> 
>  * Continue to cleanup and reduce the abstractions in DC where it
>    makes sense.
>  * Removing duplicate code in I2C and AUX as we transition to using the
>    DRM core interfaces.  We can't fully transition until we've helped
>    fill in the gaps in the drm core that we need for certain features.
>  * Making sure Atomic API support is correct.  Some of the semantics of
>    the Atomic API were not particularly clear when we started this,
>    however, that is improving a lot as the core drm documentation
>    improves.  Getting this code upstream and in the hands of more
>    atomic users will further help us identify and rectify any gaps we
>    have.

Ok so I guess Dave is typing some more general comments about
demidlayering, let me type some guidelines about atomic. Hopefully this
all materializes itself a bit better into improved upstream docs, but meh.

Step 0: Prep

So atomic is transactional, but it's not validate + rollback or commit,
but duplicate state, validate and then either throw away or commit.
There's a few big reasons for this: a) partial atomic updates - if you
duplicate it's much easier to check that you have all the right locks b)
kfree() is much easier to check for correctness than a rollback code and
c) atomic_check functions are much easier to audit for invalid changes to
persistent state.

Trouble is that this seems a bit unusual compared to all other approaches,
and ime (from the drawn-out i915 conversion) you really don't want to mix
things up. Ofc for private state you can roll back (e.g. vc4 does that for
the drm_mm allocator thing for scanout slots or whatever it is), but it's
trivial easy to accidentally check the wrong state or mix them up or
something else bad.

Long story short, I think step 0 for DC is to split state from objects,
i.e. for each dc_surface/foo/bar you need a dc_surface/foo/bar_state. And
all the back-end functions need to take both the object and the state
explicitly.

This is a bit a pain to do, but should be pretty much just mechanical. And
imo not all of it needs to happen before DC lands in upstream, but see
above imo that half-converted state is postively horrible. This should
also not harm cross-os reuse at all, you can still store things together
on os where that makes sense.

Guidelines for amdgpu atomic structures

drm atomic stores everything in state structs on plane/connector/crtc.
This includes any property extensions or anything else really, the entire
userspace abi is built on top of this. Non-trivial drivers are supposed to
subclass these to store their own stuff, so e.g.

amdgpu_plane_state {
	struct drm_plane_state base;

	/* amdgpu glue state and stuff that's linux-specific, e.g.
	 * property values and similar things. Note that there's strong
	 * push towards standardizing properties and stroing them in the
	 * drm_*_state structs. */

	struct dc_surface_state surface_state;

	/* other dc states that fit to a plane */
};

Yes not everything will fit 1:1 in one of these, but to get started I
strongly recommend to make them fit (maybe with reduced feature sets to
start out). Stuff that is shared between e.g. planes, but always on the
same crtc can be put into amdgpu_crtc_state, e.g. if you have scalers that
are assignable to a plane.

Of course atomic also supports truly global resources, for that you need
to subclass drm_atomic_state. Currently msm and i915 do that, and probably
best to read those structures as examples until I've typed the docs. But I
expect that especially for planes a few dc_*_state structs will stay in
amdgpu_*_state.

Guidelines for atomic_check

Please use the helpers as much as makes sense, and put at least the basic
steps that from drm_*_state into the respective dc_*_state functional
block into the helper callbacks for that object. I think basic validation
of individal bits (as much as possible, e.g. if you just don't support
e.g. scaling or rotation with certain pixel formats) should happen in
there too. That way when we e.g. want to check how drivers corrently
validate a given set of properties to be able to more strictly define the
semantics, that code is easy to find.

Also I expect that this won't result in code duplication with other OS,
you need code to map from drm to dc anyway, might as well check&reject the
stuff that dc can't even represent right there.

The other reason is that the helpers are good guidelines for some of the
semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives the right
answer after atomic_check. If it doesn't, then you're driver doesn't
follow atomic. If you completely roll your own this becomes much harder to
assure.

Of course extend it all however you want, e.g. by adding all the global
optimization and resource assignment stuff after initial per-object
checking has been done using the helper infrastructure.

Guidelines for atomic_commit

Use the new nonblcoking helpers. Everyone who didn't got it wrong. Also,
your atomic_commit should pretty much match the helper one, except for a
custom swap_state to handle all your globally shared specia dc_*_state
objects. Everything hw specific should be in atomic_commit_tail.

Wrt the hw commit itself, for the modeset step just roll your own. That's
the entire point of atomic, and atm both i915 and nouveau exploit this
fully. Besides a bit of glue there shouldn't be much need for
linux-specific code here - what you need is something to fish the right
dc_*_state objects and give it your main sequencer functions. What you
should make sure though is that only ever do a modeset when that was
signalled, i.e. please use drm_crtc_needs_modeset to control that part.
Feel free to wrap up in a dc_*_needs_modeset for better abstraction if
that's needed.

I do strongly suggest however that you implement the plane commit using
the helpers. There's really only a few ways to implement this in the hw,
and it should work everywhere.

Misc guidelines

Use the suspend/resume helpers. If your atomic can't do that, it's not
terribly good. Also, if DC can't make those fit, it's probably still too
much midlayer and its own world than helper library.

Use all the legacy helpers, again your atomic should be able to pull it
off. One exception is async plane flips (both primary and cursors), that's
atm still unsolved. Probably best to keep the old code around for just
that case (but redirect to the compat helpers for everything), see e.g.
how vc4 implements cursors.

Most imporant of all

Ask questions on #dri-devel. amdgpu atomic is the only nontrivial atomic
driver for which I don't remember a single discussion about some detail,
at least not with any of the DAL folks. Michel&Alex asked some questions
sometimes, but that indirection is bonghits and the defeats the point of
upstream: Direct cross-vendor collaboration to get shit done. Please make
it happen.

Oh and I pretty much assume Harry&Tony are volunteered to review atomic
docs ;-)

Cheers, Daniel

> 
> Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
> work on DC is public.  We're currently transitioning to a public patch
> review. You can follow our progress on the amd-gfx mailing list. We value
> community feedback on our work.
> 
> As an appendix I've included a brief overview of the how the code currently
> works to make understanding and reviewing the code easier.
> 
> Prior discussions on DC:
> 
>  * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>  *
> https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
> 
> Current version of DC:
> 
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> 
> Once Alex pulls in the latest patches:
> 
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> 
> Best Regards,
> Harry
> 
> 
> ************************************************
> *** Appendix: A Day in the Life of a Modeset ***
> ************************************************
> 
> Below is a high-level overview of a modeset with dc. Some of this might be a
> little out-of-date since it's based on my XDC presentation but it should be
> more-or-less the same.
> 
> amdgpu_dm_atomic_commit()
> {
>   /* setup atomic state */
>   drm_atomic_helper_prepare_planes(dev, state);
>   drm_atomic_helper_swap_state(dev, state);
>   drm_atomic_helper_update_legacy_modeset_state(dev, state);
> 
>   /* create or remove targets */
> 
>   /********************************************************************
>    * *** Call into DC to commit targets with list of all known targets
>    ********************************************************************/
>   /* DC is optimized not to do anything if 'targets' didn't change. */
>   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>   {
>     /******************************************************************
>      * *** Build context (function also used for validation)
>      ******************************************************************/
>     result = core_dc->res_pool->funcs->validate_with_context(
>                                core_dc,set,target_count,context);
> 
>     /******************************************************************
>      * *** Apply safe power state
>      ******************************************************************/
>     pplib_apply_safe_state(core_dc);
> 
>     /****************************************************************
>      * *** Apply the context to HW (program HW)
>      ****************************************************************/
>     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
>     {
>       /* reset pipes that need reprogramming */
>       /* disable pipe power gating */
>       /* set safe watermarks */
> 
>       /* for all pipes with an attached stream */
>         /************************************************************
>          * *** Programming all per-pipe contexts
>          ************************************************************/
>         status = apply_single_controller_ctx_to_hw(...)
>         {
>           pipe_ctx->tg->funcs->set_blank(...);
>           pipe_ctx->clock_source->funcs->program_pix_clk(...);
>           pipe_ctx->tg->funcs->program_timing(...);
>           pipe_ctx->mi->funcs->allocate_mem_input(...);
>           pipe_ctx->tg->funcs->enable_crtc(...);
>           bios_parser_crtc_source_select(...);
> 
>           pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
>           pipe_ctx->opp->funcs->opp_program_fmt(...);
> 
>           stream->sink->link->link_enc->funcs->setup(...);
>           pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
>           pipe_ctx->tg->funcs->set_blank_color(...);
> 
>           core_link_enable_stream(pipe_ctx);
>           unblank_stream(pipe_ctx,
> 
>           program_scaler(dc, pipe_ctx);
>         }
>       /* program audio for all pipes */
>       /* update watermarks */
>     }
> 
>     program_timing_sync(core_dc, context);
>     /* for all targets */
>       target_enable_memory_requests(...);
> 
>     /* Update ASIC power states */
>     pplib_apply_display_requirements(...);
> 
>     /* update surface or page flip */
>   }
> }
> 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch