Re: [RFC] Using DC in amdgpu for upcoming GPU

On 08.12.2016 at 16:34, Daniel Vetter wrote:
On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote:
Hi Daniel,

just a quick clarification in-line about "validation" inside atomic_commit.

On 2016-12-08 04:59 AM, Daniel Vetter wrote:
Hi Harry,

On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
We propose to use the Display Core (DC) driver for display support on
AMD's upcoming GPU (referred to as uGPU in the rest of this doc). In order
to avoid a flag day, the plan is to only support uGPU initially and
transition older ASICs to it gradually.

The DC component has received extensive testing within AMD for DCE8, 10,
and 11 GPUs and is being prepared for uGPU. Support should be better than
amdgpu's current display support.

  * All of our QA effort is focused on DC
  * All of our CQE effort is focused on DC
  * All of our OEM preloads and custom engagements use DC
  * DC behavior mirrors what we do for other OSes

The new ASIC uses a completely re-designed atom interface, so we cannot
easily leverage much of the existing atom-based code.

We introduced DC to the community earlier in 2016 and received a fair
amount of feedback. Some of the things we've addressed so far:

  * Self-contained ASIC-specific code. We did a bunch of work to pull
    common sequences into dc/dce and leave ASIC-specific code in
    separate folders.
  * Started to expose AUX and I2C through generic kernel/drm
    functionality and are mostly using that. Some of that code is still
    needlessly convoluted. This cleanup is in progress.
  * Integrated Dave and Jerome's work on removing abstraction in the
    bios parser.
  * Retired the adapter service and ASIC capability code.
  * Removed some abstraction in GPIO.

Since a lot of our code is shared with pre- and post-silicon validation
suites, changes need to be made gradually to prevent breakages from a
major flag day. This, coupled with adding support for new ASICs and lots
of new feature introductions, means progress has not been as quick as we
would have liked. We have nonetheless made a lot of progress.

The remaining concerns brought up during the last review that we are
working on addressing:

  * Continuing to clean up and reduce the abstractions in DC where it
    makes sense.
  * Removing duplicate code in I2C and AUX as we transition to using the
    DRM core interfaces. We can't fully transition until we've helped
    fill in the gaps in the drm core that we need for certain features.
  * Making sure Atomic API support is correct. Some of the semantics of
    the Atomic API were not particularly clear when we started this;
    however, that is improving a lot as the core drm documentation
    improves. Getting this code upstream and into the hands of more
    atomic users will further help us identify and rectify any gaps we
    have.

Unfortunately we cannot expose code for uGPU yet. However, the refactor
and cleanup work on DC is public. We're currently transitioning to a
public patch review process. You can follow our progress on the amd-gfx
mailing list. We value community feedback on our work.

As an appendix I've included a brief overview of how the code currently
works to make understanding and reviewing the code easier.

Prior discussions on DC:

  * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
  * https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html

Current version of DC:

  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7

Once Alex pulls in the latest patches:

  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7

Best Regards,
Harry


************************************************
*** Appendix: A Day in the Life of a Modeset ***
************************************************

Below is a high-level overview of a modeset with DC. Some of this might be
a little out of date since it's based on my XDC presentation, but it
should be more or less the same.

amdgpu_dm_atomic_commit()
{
   /* setup atomic state */
   drm_atomic_helper_prepare_planes(dev, state);
   drm_atomic_helper_swap_state(dev, state);
   drm_atomic_helper_update_legacy_modeset_state(dev, state);

   /* create or remove targets */

   /********************************************************************
    * *** Call into DC to commit targets with list of all known targets
    ********************************************************************/
   /* DC is optimized not to do anything if 'targets' didn't change. */
   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
   {
     /******************************************************************
      * *** Build context (function also used for validation)
      ******************************************************************/
     result = core_dc->res_pool->funcs->validate_with_context(
                                core_dc, set, target_count, context);
I can't dig into the details of DC, so this is not a 100% assessment, but
if you call a function called "validate" in atomic_commit, you're very,
very likely breaking atomic. _All_ validation must happen in
->atomic_check; if that's not the case, TEST_ONLY mode is broken. And
atomic userspace is relying on that working.
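
For reference, this is roughly what generic atomic userspace does with
TEST_ONLY (a minimal libdrm sketch; fd, crtc_id and active_prop_id are
placeholders, and error handling is elided):

   #include <xf86drm.h>
   #include <xf86drmMode.h>

   drmModeAtomicReq *req = drmModeAtomicAlloc();

   /* Build the desired state as object/property/value triples. */
   drmModeAtomicAddProperty(req, crtc_id, active_prop_id, 1);
   /* ... more properties ... */

   /* Ask the driver to check the full state without touching the hw.
    * This must run exactly the same checks as a real commit, which is
    * why all validation has to live in ->atomic_check. */
   int ret = drmModeAtomicCommit(fd, req, DRM_MODE_ATOMIC_TEST_ONLY, NULL);
   if (ret == 0)
           ret = drmModeAtomicCommit(fd, req, 0, NULL); /* real commit */

   drmModeAtomicFree(req);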

This function is not really named correctly. What it does is build a
context and validate at the same time. In commit we only care that it
builds the context; validation should never fail here (since the state
was already validated in atomic_check).

We call the same function from atomic_check:

amdgpu_dm_atomic_check ->
	dc_validate_resources ->
		core_dc->res_pool->funcs->validate_with_context
Ah right, iirc you told me this last time around too ;-) I guess it's a
great example of what I mean by rolling your own world: existing atomic
drivers put their derived/computed/validated state into their subclassed
state structures, which means it doesn't need to be re-computed in
atomic_commit. It also makes sure that the validation/state computation
code between check and commit doesn't get out of sync.
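
Roughly, the established pattern looks like this (a sketch; the foo_*
names and the cached field are made up, and the matching state
duplicate/destroy hooks are omitted):

   /* Derived state lives in a subclassed state object: computed and
    * validated once in ->atomic_check, then reused as-is in commit. */
   struct foo_crtc_state {
           struct drm_crtc_state base;
           int dpp_clock_khz;              /* derived in atomic_check */
   };

   #define to_foo_crtc_state(s) \
           container_of(s, struct foo_crtc_state, base)

   static int foo_crtc_atomic_check(struct drm_crtc *crtc,
                                    struct drm_crtc_state *state)
   {
           struct foo_crtc_state *foo_state = to_foo_crtc_state(state);
           int clk = foo_compute_dpp_clock(state); /* hypothetical */

           if (clk < 0)
                   return -EINVAL; /* reject here, never in commit */

           foo_state->dpp_clock_khz = clk;
           return 0;
   }

The commit code then just programs foo_state->dpp_clock_khz, so check and
commit can't drift apart.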

As for the rest, I hear you and appreciate your feedback. Let me get back to
you on that later.
Just an added note on that: I do think there are some driver teams who've
managed to pull off a shared codebase between validation and upstream
Linux (iirc some of the Intel wireless drivers work like that). But it
requires careful alignment of everything, and with something fast-moving
like drm it might become really painful and not be worth it. So I'm not
outright rejecting DC (and the code sharing you want to achieve with it)
as an idea here.

I used to have examples of such things for other network drivers as well,
but right now I can't find them offhand. Leave me a note if you need more
info on existing cases.

It might also be a good idea to take a look at drivers shared between
Linux and BSD, because both code bases are usually publicly available and
you can see what changes during porting and what stays the same.

Regards,
Christian.

-Daniel

Thanks,
Harry


The only thing that you're allowed to return from ->atomic_commit is
out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
The kerneldoc explains this.
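
In code terms, the commit side may only fail on resource acquisition,
never on validation. A sketch (the foo_* names are made up; the
nonblocking path is omitted):

   static int foo_atomic_commit(struct drm_device *dev,
                                struct drm_atomic_state *state,
                                bool nonblock)
   {
           int ret;

           /* Pinning framebuffers may legitimately fail, e.g. -ENOMEM. */
           ret = drm_atomic_helper_prepare_planes(dev, state);
           if (ret)
                   return ret;

           drm_atomic_helper_swap_state(dev, state);

           /* From here on nothing is allowed to fail: the state was
            * fully validated in ->atomic_check already. */
           foo_program_hw(dev, state);     /* hypothetical, returns void */

           drm_atomic_helper_cleanup_planes(dev, state);
           return 0;
   }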

Now the reason I bring this up (and we've discussed it at length in
private) is that DC still suffers from a massive abstraction midlayer. A
lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
timers, irq, ...) has been cleaned up, but the midlayer is still there.
And I understand why you have it, and why it's there - without some OS
abstraction your grand plan of a unified driver across everything doesn't
work out so well.

But in a way the backend stuff isn't such a big deal. It's annoying since
lots of code and bugfixes have to be duplicated and all that, but it's
fairly easy to fix case-by-case, and as long as AMD folks stick around
(which I fully expect) it's not a maintenance issue. It makes it harder
for others to contribute, but since it's mostly leaf code it's generally
easy to just improve the part you want to change (as an outsider). And if
you want to improve shared code the only downside is that you can't also
improve amd, but that's not so much a problem for non-amd folks ;-)

The problem otoh with the abstraction layer between the drm core and the
amd driver is that you can't ignore it if you want to refactor shared
code. And because it's an entire world of its own, it's much harder to
understand what the driver is doing (without reading it all). Some
examples of what I mean:

- All other drm drivers subclass drm objects (by embedding them) into the
   corresponding hw part that most closely matches the drm object's
   semantics. That means even when you have 0 clue about how a given piece
   of hw works, you have a reasonable chance of understanding the code. If
   it's all your own stuff you always have to keep in mind the special amd
   naming conventions. That gets old real fast if you're trying to figure
   out what 20+ (or are we at 30 already?) drivers are doing. (There's a
   short sketch of the embedding pattern after this list.)

- This is even more true for atomic. Atomic has a pretty complicated
   check/commit transactional model for updating display state. It's a
   standardized interface, and it's extensible, and we want generic
   userspace to be able to run on any driver. Fairly often we realize that
   the semantics of existing or newly proposed properties and state aren't
   well-defined enough, and then we need to go and read all the drivers to
   figure out how to fix up the mess. DC has its own entirely separate
   state structures which again don't subclass the atomic core structures
   (afaik at least). Again the same problems apply: you can't find things,
   and figuring out the exact semantics and spotting differences in
   behaviour is almost impossible.

- The trouble isn't just in reading code and understanding it correctly,
   it's also in finding it. If you have your own completely different world
   then just finding the right code is hard - cscope and grep fail to work.

- Another issue is that very often we unify semantics across drivers by
   adding some new helpers that at least do the right thing for most of
   the drivers. If you have your own world then the impedance mismatch
   will make sure that the amd drivers end up with slightly different
   semantics, and I think that's not good for the ecosystem and kms -
   people want to run a lot more than just a boot splash with generic kms
   userspace, and stuff like xf86-video-$vendor is going out of favour
   heavily. (Sketch below.)
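
To make the subclassing point concrete, here's roughly what embedding
looks like in practice (a sketch; the foo_* names are made up):

   struct foo_crtc {
           struct drm_crtc base;   /* the drm core object, embedded */
           void __iomem *mmio;     /* hw-specific bits live alongside */
   };

   static inline struct foo_crtc *to_foo_crtc(struct drm_crtc *crtc)
   {
           return container_of(crtc, struct foo_crtc, base);
   }

Anyone who knows struct drm_crtc can find and follow the hw-specific parts
from there, without learning a driver-private naming scheme.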
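
And for the helper point: a driver with no special needs can pick up the
shared check/commit semantics wholesale (a sketch; foo_fb_create stands in
for the driver's fb_create hook):

   static const struct drm_mode_config_funcs foo_mode_config_funcs = {
           .fb_create     = foo_fb_create,
           .atomic_check  = drm_atomic_helper_check,
           .atomic_commit = drm_atomic_helper_commit,
   };

When a helper gets fixed or extended, every driver wired up like this
picks up the new semantics for free; a driver living behind its own
midlayer has to chase them by hand.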

Note that all this isn't about amd walking away and leaving an
unmaintainable mess behind. Like I've said, I don't think that's a big
risk. The trouble is that having your own world makes it harder for
everyone else to understand the amd driver, and understanding all the
drivers is very often step 1 in some big refactoring or feature addition
effort, because starting to refactor without understanding the problem
generally doesn't work ;-) And you can't make this step 1 easier for
others by promising to always maintain DC and update it to all the core
changes, because that's only step 2.

In all the DC discussions we've had thus far I haven't seen anyone address
this issue. And this isn't just an issue in drm, it's pretty much
established across all Linux subsystems with the "no midlayer or OS
abstraction layers in drivers" rule. There are some real solid reasons why
such a HAL is extremely unpopular with upstream. And I haven't yet seen
any good reason why amd needs to be different; thus far it looks like a
textbook case, and there have been lots of vendors in lots of subsystems
who tried to push their HAL.

Thanks, Daniel

     /******************************************************************
      * *** Apply safe power state
      ******************************************************************/
     pplib_apply_safe_state(core_dc);

     /****************************************************************
      * *** Apply the context to HW (program HW)
      ****************************************************************/
     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
     {
       /* reset pipes that need reprogramming */
       /* disable pipe power gating */
       /* set safe watermarks */

       /* for all pipes with an attached stream */
         /************************************************************
          * *** Programming all per-pipe contexts
          ************************************************************/
         status = apply_single_controller_ctx_to_hw(...)
         {
           pipe_ctx->tg->funcs->set_blank(...);
           pipe_ctx->clock_source->funcs->program_pix_clk(...);
           pipe_ctx->tg->funcs->program_timing(...);
           pipe_ctx->mi->funcs->allocate_mem_input(...);
           pipe_ctx->tg->funcs->enable_crtc(...);
           bios_parser_crtc_source_select(...);

           pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
           pipe_ctx->opp->funcs->opp_program_fmt(...);

           stream->sink->link->link_enc->funcs->setup(...);
           pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
           pipe_ctx->tg->funcs->set_blank_color(...);

           core_link_enable_stream(pipe_ctx);
           unblank_stream(pipe_ctx, ...);

           program_scaler(dc, pipe_ctx);
         }
       /* program audio for all pipes */
       /* update watermarks */
     }

     program_timing_sync(core_dc, context);
     /* for all targets */
       target_enable_memory_requests(...);

     /* Update ASIC power states */
     pplib_apply_display_requirements(...);

     /* update surface or page flip */
   }
}


_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel


