[RFC] Using DC in amdgpu for upcoming GPU

christian.koenig@xxxxxxx (Christian König) · Thu, 8 Dec 2016 16:41:52 +0100

On 08.12.2016 at 16:34, Daniel Vetter wrote:
> On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote:
>> Hi Daniel,
>>
>> just a quick clarification in-line about "validation" inside atomic_commit.
>>
>> On 2016-12-08 04:59 AM, Daniel Vetter wrote:
>>> Hi Harry,
>>>
>>> On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>>>> We propose to use the Display Core (DC) driver for display support on
>>>> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
>>>> avoid a flag day the plan is to only support uGPU initially and transition
>>>> to older ASICs gradually.
>>>>
>>>> The DC component has received extensive testing within AMD for DCE8, 10, and
>>>> 11 GPUs and is being prepared for uGPU. Support should be better than
>>>> amdgpu's current display support.
>>>>
>>>>   * All of our QA effort is focused on DC
>>>>   * All of our CQE effort is focused on DC
>>>>   * All of our OEM preloads and custom engagements use DC
>>>>   * DC behavior mirrors what we do for other OSes
>>>>
>>>> The new asic utilizes a completely re-designed atom interface, so we cannot
>>>> easily leverage much of the existing atom-based code.
>>>>
>>>> We introduced DC to the community earlier in 2016 and received a fair
>>>> amount of feedback. Some of what we've addressed so far:
>>>>
>>>>   * Self-contain ASIC specific code. We did a bunch of work to pull
>>>>     common sequences into dc/dce and leave ASIC specific code in
>>>>     separate folders.
>>>>   * Started to expose AUX and I2C through generic kernel/drm
>>>>     functionality and are mostly using that. Some of that code is still
>>>>     needlessly convoluted. This cleanup is in progress.
>>>>   * Integrated Dave and Jerome's work on removing abstraction in bios
>>>>     parser.
>>>>   * Retire adapter service and asic capability
>>>>   * Remove some abstraction in GPIO
>>>>
>>>> Since a lot of our code is shared with pre- and post-silicon validation
>>>> suites, changes need to be done gradually to prevent breakages due to a major
>>>> flag day.  This, coupled with adding support for new asics and lots of new
>>>> feature introductions, means progress has not been as quick as we would have
>>>> liked. We have made a lot of progress nonetheless.
>>>>
>>>> The remaining concerns that were brought up during the last review that we
>>>> are working on addressing:
>>>>
>>>>   * Continue to clean up and reduce the abstractions in DC where it
>>>>     makes sense.
>>>>   * Removing duplicate code in I2C and AUX as we transition to using the
>>>>     DRM core interfaces.  We can't fully transition until we've helped
>>>>     fill in the gaps in the drm core that we need for certain features.
>>>>   * Making sure Atomic API support is correct.  Some of the semantics of
>>>>     the Atomic API were not particularly clear when we started this,
>>>>     however, that is improving a lot as the core drm documentation
>>>>     improves.  Getting this code upstream and in the hands of more
>>>>     atomic users will further help us identify and rectify any gaps we
>>>>     have.
>>>>
>>>> Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
>>>> work on DC is public.  We're currently transitioning to a public patch
>>>> review. You can follow our progress on the amd-gfx mailing list. We value
>>>> community feedback on our work.
>>>>
>>>> As an appendix I've included a brief overview of how the code currently
>>>> works to make understanding and reviewing the code easier.
>>>>
>>>> Prior discussions on DC:
>>>>
>>>>   * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>>>>   * https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
>>>>
>>>> Current version of DC:
>>>>
>>>>   * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>>>
>>>> Once Alex pulls in the latest patches:
>>>>
>>>>   * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>>>
>>>> Best Regards,
>>>> Harry
>>>>
>>>>
>>>> ************************************************
>>>> *** Appendix: A Day in the Life of a Modeset ***
>>>> ************************************************
>>>>
>>>> Below is a high-level overview of a modeset with dc. Some of this might be a
>>>> little out-of-date since it's based on my XDC presentation but it should be
>>>> more-or-less the same.
>>>>
>>>> amdgpu_dm_atomic_commit()
>>>> {
>>>>    /* setup atomic state */
>>>>    drm_atomic_helper_prepare_planes(dev, state);
>>>>    drm_atomic_helper_swap_state(dev, state);
>>>>    drm_atomic_helper_update_legacy_modeset_state(dev, state);
>>>>
>>>>    /* create or remove targets */
>>>>
>>>>    /********************************************************************
>>>>     * *** Call into DC to commit targets with list of all known targets
>>>>     ********************************************************************/
>>>>    /* DC is optimized not to do anything if 'targets' didn't change. */
>>>>    dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>>>>    {
>>>>      /******************************************************************
>>>>       * *** Build context (function also used for validation)
>>>>       ******************************************************************/
>>>>      result = core_dc->res_pool->funcs->validate_with_context(
>>>>                                 core_dc,set,target_count,context);
>>> I can't dig into details of DC, so this is not a 100% assessment, but if
>>> you call a function called "validate" in atomic_commit, you're very, very
>>> likely breaking atomic. _All_ validation must happen in ->atomic_check,
>>> if that's not the case TEST_ONLY mode is broken. And atomic userspace is
>>> relying on that working.
>>>
>> This function is not really named correctly. What it does is it builds a
>> context and validates at the same time. In commit we simply care that it
>> builds the context. Validate should never fail here (since this was already
>> validated in atomic_check).
>>
>> We call the same function at atomic_check
>>
>> amdgpu_dm_atomic_check ->
>> 	dc_validate_resources ->
>> 		core_dc->res_pool->funcs->validate_with_context
> Ah right, iirc you told me this the last time around too ;-) I guess a
> great example for what I mean with rolling your own world: Existing atomic
> drivers put their derived/computed/validated state into their subclassed
> state structures, which means it doesn't need to be re-computed in
> atomic_commit. It also makes sure that the validation code/state
> computation code between check and commit doesn't get out of sync.
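To illustrate the pattern described above, here is a minimal sketch in plain C. It is not DRM or DC code: struct core_crtc_state stands in for a core state object, and my_crtc_state, my_atomic_check, my_atomic_commit and the clock constants are hypothetical names chosen only for this example. The check step validates the request and stores the derived value in the subclassed state; the commit step consumes it without recomputing anything.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a core state object (not the real drm_crtc_state). */
struct core_crtc_state {
	int mode_clock_khz;	/* value requested by userspace */
};

/* Driver state subclasses the core state by embedding it. */
struct my_crtc_state {
	struct core_crtc_state base;
	int pll_divider;	/* derived in check, consumed in commit */
};

/* Hypothetical hardware limits, assumed for this example only. */
#define MY_REF_CLOCK_KHZ 600000
#define MY_MAX_DIVIDER   64

/* Recover the driver state from a pointer to the embedded core state. */
static struct my_crtc_state *to_my_crtc_state(struct core_crtc_state *s)
{
	return (struct my_crtc_state *)((char *)s -
					offsetof(struct my_crtc_state, base));
}

/* Check: validate the request and keep the derived divider in the state. */
static int my_atomic_check(struct core_crtc_state *s)
{
	struct my_crtc_state *ms = to_my_crtc_state(s);
	int div;

	if (s->mode_clock_khz <= 0 || MY_REF_CLOCK_KHZ % s->mode_clock_khz)
		return -EINVAL;

	div = MY_REF_CLOCK_KHZ / s->mode_clock_khz;
	if (div > MY_MAX_DIVIDER)
		return -EINVAL;

	ms->pll_divider = div;
	return 0;
}

/* Commit: no validation, only write the precomputed value to the "register". */
static void my_atomic_commit(struct core_crtc_state *s, int *pll_reg)
{
	struct my_crtc_state *ms = to_my_crtc_state(s);

	assert(ms->pll_divider > 0);	/* check must have run first */
	*pll_reg = ms->pll_divider;
}
```

Because the divider lives in the state object, check and commit cannot drift apart: there is only one place where it is computed.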
>
>> As for the rest, I hear you and appreciate your feedback. Let me get back to
>> you on that later.
> Just an added note on that: I do think that there's some driver teams
> who've managed to pull a shared codebase between validation and upstream
> linux (iirc some of the intel wireless drivers work like that). But it
> requires careful aligning of everything, and with something fast-moving
> like drm it might become real painful and not really worth it. So not
> outright rejecting DC (and the code sharing you want to achieve with it)
> as an idea here.

I used to have examples of such things for other network drivers as
well, but right now I can't find them offhand. Leave me a note if you
need more info on existing things.

It might also be a good idea to take a look at drivers shared between
Linux and BSD, because both code bases are usually publicly available
and you can see what changes during porting and what stays the same.

Regards,
Christian.

> -Daniel
>
>> Thanks,
>> Harry
>>
>>
>>> The only thing that you're allowed to return from ->atomic_commit is
>>> out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
>>> Kerneldoc explains this.
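The check/commit contract described above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the real DRM ioctl path: demo_device, demo_state, demo_atomic_check, demo_atomic_commit and demo_atomic_ioctl are hypothetical names. The point is that a TEST_ONLY request stops after check, so its answer is only trustworthy if commit never rejects a state that check accepted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device and state, invented for this sketch. */
struct demo_device {
	int max_planes;
	int planes_active;	/* what the "hardware" currently shows */
};

struct demo_state {
	int planes_requested;
	bool checked;
};

/* All validation lives here; the device itself is not touched. */
static int demo_atomic_check(const struct demo_device *dev,
			     struct demo_state *st)
{
	if (st->planes_requested < 0 ||
	    st->planes_requested > dev->max_planes)
		return -EINVAL;

	st->checked = true;
	return 0;
}

/* Commit applies an already validated state and does not re-validate. */
static int demo_atomic_commit(struct demo_device *dev,
			      const struct demo_state *st)
{
	assert(st->checked);
	dev->planes_active = st->planes_requested;
	return 0;
}

/* Entry point: a test-only request returns the check result and stops. */
static int demo_atomic_ioctl(struct demo_device *dev, struct demo_state *st,
			     bool test_only)
{
	int ret = demo_atomic_check(dev, st);

	if (ret || test_only)
		return ret;

	return demo_atomic_commit(dev, st);
}
```

If demo_atomic_commit contained extra validation, a test-only call could report success for a configuration that later fails on the real commit, which is exactly the breakage userspace cannot tolerate.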
>>>
>>> Now the reason I bring this up (and we've discussed it at length in
>>> private) is that DC still suffers from a massive abstraction midlayer. A
>>> lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
>>> timers, irq, ...) have been cleaned up, but the midlayer is still there.
>>> And I understand why you have it, and why it's there - without some OS
>>> abstraction your grand plan of a unified driver across everything doesn't
>>> work out so well.
>>>
>>> But in a way the backend stuff isn't such a big deal. It's annoying since
>>> lots of code, and bugfixes have to be duplicated and all that, but it's
>>> fairly easy to fix case-by-case, and as long as AMD folks stick around
>>> (which I fully expect) not a maintenance issue. It makes it harder for
>>> others to contribute, but then since it's mostly the leaf it's generally
>>> easy to just improve the part you want to change (as an outsider). And if
>>> you want to improve shared code the only downside is that you can't also
>>> improve amd, but that's not so much a problem for non-amd folks ;-)
>>>
>>> The problem otoh with the abstraction layer between drm core and the amd
>>> driver is that you can't ignore it if you want to refactor shared code. And
>>> because it's an entire world of its own, it's much harder to understand
>>> what the driver is doing (without reading it all). Some examples of what I
>>> mean:
>>>
>>> - All other drm drivers subclass drm objects (by embedding them) into the
>>>    corresponding hw part that most closely matches the drm object's
>>>    semantics. That means even when you have 0 clue about how a given piece
>>>    of hw works, you have a reasonable chance of understanding code. If it's
>>>    all your own stuff you always have to keep in mind the special amd
>>>    naming conventions. That gets old real fast if you're trying to figure out
>>>    what 20+ (or are we at 30 already?) drivers are doing.
>>>
>>> - This is even more true for atomic. Atomic has a pretty complicated
>>>    check/commit transactional model for updating display state. It's a
>>>    standardized interface, and it's extensible, and we want generic
>>>    userspace to be able to run on any driver. Fairly often we realize that
>>>    semantics of existing or newly proposed properties and state aren't
>>>    well-defined enough, and then we need to go&read all the drivers and
>>>    figure out how to fix up the mess. DC has its own entirely separate state
>>>    structures which again don't subclass the atomic core structures (afaik
>>>    at least). Again the same problems apply that you can't find things, and
>>>    that figuring out the exact semantics and spotting differences in
>>>    behaviour is almost impossible.
>>>
>>> - The trouble isn't just in reading code and understanding it correctly,
>>>    it's also in finding it. If you have your own completely different world
>>>    then just finding the right code is hard - cscope and grep fail to work.
>>>
>>> - Another issue is that very often we unify semantics in drivers by adding
>>>    some new helpers that at least dtrt for most of the drivers. If you have
>>>    your own world then the impedance mismatch will make sure that amd
>>>    drivers will have slightly different semantics, and I think that's not
>>>    good for the ecosystem and kms - people want to run a lot more than just
>>>    a boot splash with generic kms userspace, stuff like xf86-video-$vendor
>>>    is going out of favour heavily.
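The points in the list above about embedding core objects and sharing helpers can be sketched like this in plain C. Everything here is hypothetical and much simpler than the real DRM structures: core_crtc, core_crtc_funcs and core_helper_disable_all play the role of shared core code, while my_crtc and my_crtc_disable play the role of one driver. The shared helper only knows the core type, yet it works for any driver that embeds it.

```c
#include <assert.h>
#include <stddef.h>

/* --- "core" side: hypothetical, shared by every driver --- */
struct core_crtc;

struct core_crtc_funcs {
	void (*disable)(struct core_crtc *crtc);
};

struct core_crtc {
	const struct core_crtc_funcs *funcs;
	int enabled;
};

/* Shared helper: disables every enabled crtc, returns how many it touched. */
static int core_helper_disable_all(struct core_crtc **crtcs, int n)
{
	int count = 0;

	for (int i = 0; i < n; i++) {
		if (!crtcs[i]->enabled)
			continue;
		crtcs[i]->funcs->disable(crtcs[i]);
		crtcs[i]->enabled = 0;
		count++;
	}
	return count;
}

/* --- "driver" side: the hw object embeds the core object --- */
struct my_crtc {
	struct core_crtc base;
	unsigned int ctrl_reg;	/* pretend register, bit 0 = enable */
};

static struct my_crtc *to_my_crtc(struct core_crtc *crtc)
{
	return (struct my_crtc *)((char *)crtc -
				  offsetof(struct my_crtc, base));
}

/* Driver hook: clear the enable bit of this crtc's pretend register. */
static void my_crtc_disable(struct core_crtc *crtc)
{
	to_my_crtc(crtc)->ctrl_reg &= ~1u;
}

static const struct core_crtc_funcs my_crtc_funcs = {
	.disable = my_crtc_disable,
};
```

A reader who knows only the core types can follow core_helper_disable_all and then find the driver hook by searching for the funcs table, which is the discoverability argument made in the list.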
>>>
>>> Note that all this isn't about amd walking away and leaving an
>>> unmaintainable mess behind. Like I've said I don't think this is a big
>>> risk. The trouble is that having your own world makes it harder for
>>> everyone else to understand the amd driver, and understanding all drivers
>>> is very often step 1 in some big refactoring or feature addition effort.
>>> Because starting to refactor without understanding the problem generally
>>> doesn't work ;-) And you can't make this step 1 easier for others by
>>> promising to always maintain DC and update it to all the core changes,
>>> because that's only step 2.
>>>
>>> In all the DC discussions we've had thus far I haven't seen anyone address
>>> this issue. And this isn't just an issue in drm, it's pretty much
>>> established across all linux subsystems with the "no midlayer or OS
>>> abstraction layers in drivers" rule. There's some real solid reasons why
>>> such a HAL is extremely unpopular with upstream. And I haven't yet seen
>>> any good reason why amd needs to be different, thus far it looks like a
>>> textbook case, and there's been lots of vendors in lots of subsystems who
>>> tried to push their HAL.
>>>
>>> Thanks, Daniel
>>>
>>>>      /******************************************************************
>>>>       * *** Apply safe power state
>>>>       ******************************************************************/
>>>>      pplib_apply_safe_state(core_dc);
>>>>
>>>>      /****************************************************************
>>>>       * *** Apply the context to HW (program HW)
>>>>       ****************************************************************/
>>>>      result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
>>>>      {
>>>>        /* reset pipes that need reprogramming */
>>>>        /* disable pipe power gating */
>>>>        /* set safe watermarks */
>>>>
>>>>        /* for all pipes with an attached stream */
>>>>          /************************************************************
>>>>           * *** Programming all per-pipe contexts
>>>>           ************************************************************/
>>>>          status = apply_single_controller_ctx_to_hw(...)
>>>>          {
>>>>            pipe_ctx->tg->funcs->set_blank(...);
>>>>            pipe_ctx->clock_source->funcs->program_pix_clk(...);
>>>>            pipe_ctx->tg->funcs->program_timing(...);
>>>>            pipe_ctx->mi->funcs->allocate_mem_input(...);
>>>>            pipe_ctx->tg->funcs->enable_crtc(...);
>>>>            bios_parser_crtc_source_select(...);
>>>>
>>>>            pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
>>>>            pipe_ctx->opp->funcs->opp_program_fmt(...);
>>>>
>>>>            stream->sink->link->link_enc->funcs->setup(...);
>>>>            pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
>>>>            pipe_ctx->tg->funcs->set_blank_color(...);
>>>>
>>>>            core_link_enable_stream(pipe_ctx);
>>>>            unblank_stream(pipe_ctx, ...);
>>>>
>>>>            program_scaler(dc, pipe_ctx);
>>>>          }
>>>>        /* program audio for all pipes */
>>>>        /* update watermarks */
>>>>      }
>>>>
>>>>      program_timing_sync(core_dc, context);
>>>>      /* for all targets */
>>>>        target_enable_memory_requests(...);
>>>>
>>>>      /* Update ASIC power states */
>>>>      pplib_apply_display_requirements(...);
>>>>
>>>>      /* update surface or page flip */
>>>>    }
>>>> }
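The per-pipe part of the overview above dispatches through per-block function tables (pipe_ctx->tg->funcs->...). A minimal, compilable sketch of that dispatch style follows. The hook names set_blank, program_timing and enable_crtc are taken from the overview, but their signatures, the hw_trace recorder and every demo_* function are assumptions made up for this example; the real DC code is not shown in this thread.

```c
#include <assert.h>
#include <string.h>

/* Records the order of "hardware" calls so the sequence can be inspected. */
#define TRACE_MAX 8
struct hw_trace {
	const char *steps[TRACE_MAX];
	int count;
};

static void trace_step(struct hw_trace *t, const char *name)
{
	if (t->count < TRACE_MAX)
		t->steps[t->count++] = name;
}

/* Hypothetical timing generator with a function table (signatures assumed). */
struct timing_generator;

struct tg_funcs {
	void (*set_blank)(struct timing_generator *tg, int blank);
	void (*program_timing)(struct timing_generator *tg, int pix_clk_khz);
	void (*enable_crtc)(struct timing_generator *tg);
};

struct timing_generator {
	const struct tg_funcs *funcs;
	struct hw_trace *trace;
	int programmed_khz;
};

struct pipe_ctx {
	struct timing_generator *tg;
	int pix_clk_khz;
};

static void demo_set_blank(struct timing_generator *tg, int blank)
{
	(void)blank;
	trace_step(tg->trace, "set_blank");
}

static void demo_program_timing(struct timing_generator *tg, int pix_clk_khz)
{
	tg->programmed_khz = pix_clk_khz;
	trace_step(tg->trace, "program_timing");
}

static void demo_enable_crtc(struct timing_generator *tg)
{
	trace_step(tg->trace, "enable_crtc");
}

static const struct tg_funcs demo_tg_funcs = {
	.set_blank = demo_set_blank,
	.program_timing = demo_program_timing,
	.enable_crtc = demo_enable_crtc,
};

/* Timing generator steps in the order listed in the overview. */
static void demo_apply_single_pipe(struct pipe_ctx *pipe)
{
	pipe->tg->funcs->set_blank(pipe->tg, 1);
	pipe->tg->funcs->program_timing(pipe->tg, pipe->pix_clk_khz);
	pipe->tg->funcs->enable_crtc(pipe->tg);
}
```

Swapping demo_tg_funcs for another table is how a different hardware generation would plug in without changing demo_apply_single_pipe.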
>>>>
>>>>