On 18/06/15 15:49, Daniel Vetter wrote: > On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote: >> On 17/06/15 13:05, Daniel Vetter wrote: >>> On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote: >>>> Current devices may contain one or more programmable microcontrollers >>>> that need to have a firmware image (aka "binary blob") loaded from an >>>> external medium and transferred to the device's memory. >>>> >>>> This file provides generic support functions for doing this; they can >>>> then be used by each uC-specific loader, thus reducing code duplication >>>> and testing effort. >>>> >>>> Signed-off-by: Dave Gordon <david.s.gordon@xxxxxxxxx> >>>> Signed-off-by: Alex Dai <yu.dai@xxxxxxxxx> >>> >>> Given that I'm just shredding the synchronization used by the dmc loader >>> I'm not convinced this is a good idea. Abstraction has cost, and a bit of >>> copy-paste for similar sounding but slightly different things doesn't >>> sound awful to me. And the critical bit in all the firmware loading I've >>> seen thus far is in synchronizing the loading with other operations, >>> hiding that isn't a good idea. Worse if we enforce stuff like requiring >>> dev->struct_mutex. >>> -Daniel >> >> It's precisely because it's in some sense "trivial-but-tricky" that we >> should write it once, get it right, and use it everywhere. Copypaste >> /does/ sound awful; I've seen how the code this was derived from had >> already been cloned into three flavours, all different and all wrong. >> >> It's a very simple abstraction: one early call to kick things off as >> early as possible, no locking required. One late call with the >> struct_mutex held to complete the synchronisation and actually do the >> work, thus guaranteeing that the transfer to the target uC is done in a >> controlled fashion, at a time of the caller's choice, and by the >> driver's mainline thread, NOT by an asynchronous thread racing with >> other activity (which was one of the things wrong with the original >> version). > > Yeah I've seen the origins of this in the display code, and that code gets > the syncing wrong. The only thing that one has do to is grab a runtime pm > reference for the appropriate power well to prevent dc5 entry, and release > it when the firmware is loaded and initialized. Agreed. > Which means any kind of firmware loader which requires/uses > dev->struct_mutex get stuff wrong and is not appropriate everywhere. BUT, the loading of the firmware into any uC MUST be done in a controlled manner i.e. at a time when no other thread is touching the h/w. Otherwise the f/w load and whatever else is concurrently accessing the h/w could in some cases interfere disastrously. Examples of interference might be: * interleaved accesses to the ELSP (in the case of the GuC) * incorrect handover of power management (DMC, GuC) * erroneous management of forcewake state In general the f/w that is just starting on the uC may have certain expectations about the initial state of the h/w, which may not be met if other threads are accessing various bits of h/w while the uC is booting up. So we absolutely need to guarantee that the f/w load is done by a thread which has exclusive ownership of any bit of the h/w that the f/w is going to make assumptions about. With the current locking structure of the driver, that means holding the struct_mutex (it shouldn't really, there should be a separate mutex for h/w register access vs. driver-private data structures, but there isn't). >> We should convert the DMC loader to use this too, so there need be only >> one bit of code in the whole driver that needs to understand how to use >> completions to get correct handover from a free-running no-locks-held >> thread to the properly disciplined environment of driver mainline for >> purposes of programming the h/w. > > Nack on using this for dmc, since I want them to convert it to the above > synchronization, since that's how all the other async power initialization > is done. > > Guc is different since we really must have it ready for execbuf, and for > that usecase a completion at drm_open time sounds like the right thing. > > As a rule of thumb for refactoring and share infastructure we use the > following recipe in drm: > - first driver implements things as straightforward as possible > - 2nd user copypastes > - 3rd one has the duty to figure out whether some refactoring is in order > or not. > Imo that approach leads a really good balance between avoiding > overengineering and having maintainable code. > -Daniel We've already been through these phases; the code has already been cloned twice (and then changed, but not enough to fix the problems with the original) and then when I found the issues with the GuC loader and noticed the hilarious ownership dance it was doing during handover I realised it was time to fix it in one place rather than several, and posted a patchset to the internal mailing list on 2015-02-24 with this commentary: > The GuC loader uses an asynchronous thread to fetch the firmware image > (aka "binary blob") from a file and load it into the GuC's memory. > Unfortunately the GuC loading occurs *after* the internally-generated > batches used to initialise contexts have already been submitted using > direct access to the ELSP. Also, the firmware ends up being loaded at > an indeterminate time, with consequent potential for confusion in the > switchover from ELSP- to GuC-based submission. > > This patch series therefore reorganises the GuC loader to ensure that > the loading process occurs both early enough and at a well-defined > point in the sequence of operations during driver initialisation, > specifically *before* any batches are submitted to hardware. > > [PATCH 1/3] GuC: reorganise source before rewriting this code > [PATCH 2/3] GuC: load firmware image from main thread > [PATCH 3/3] GuC: update names & comments ("load" => "fetch") followed by [PATCH 0/2] unify and tidy firmware loading code on 2015-03-02. For the DMC module, the basic conversion process is to separate intel_csr_load_program() from finish_csr_load(). The latter would remain as the callback in the async thread loading process that has to validate the loaded image; the former would then become the callback for the synchronous post-handover transfer of the image to the h/w. BTW, the existing DMC loader probably won't work on Android :( .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx