On 5 June 2017 at 15:48, Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxx> wrote: > From: Robert Bragg <robert@xxxxxxxxxxxxx> > > Enables access to OA unit metrics for BDW, CHV, SKL and BXT which all > share (more-or-less) the same OA unit design. > > Of particular note in comparison to Haswell: some OA unit HW config > state has become per-context state and as a consequence it is somewhat > more complicated to manage synchronous state changes from the cpu while > there's no guarantee of what context (if any) is currently actively > running on the gpu. > > The periodic sampling frequency which can be particularly useful for > system-wide analysis (as opposed to command stream synchronised > MI_REPORT_PERF_COUNT commands) is perhaps the most surprising state to > have become per-context save and restored (while the OABUFFER > destination is still a shared, system-wide resource). > > This support for gen8+ takes care to consider a number of timing > challenges involved in synchronously updating per-context state > primarily by programming all config state from the cpu and updating all > current and saved contexts synchronously while the OA unit is still > disabled. > > The driver intentionally avoids depending on command streamer > programming to update OA state considering the lack of synchronization > between the automatic loading of OACTXCONTROL state (that includes the > periodic sampling state and enable state) on context restore and the > parsing of any general purpose BB the driver can control. I.e. this > implementation is careful to avoid the possibility of a context restore > temporarily enabling any out-of-date periodic sampling state. In > addition to the risk of transiently-out-of-date state being loaded > automatically; there are also internal HW latencies involved in the > loading of MUX configurations which would be difficult to account for > from the command streamer (and we only want to enable the unit when once > the MUX configuration is complete). > > Since the Gen8+ OA unit design no longer supports clock gating the unit > off for a single given context (which effectively stopped any progress > of counters while any other context was running) and instead supports > tagging OA reports with a context ID for filtering on the CPU, it means > we can no longer hide the system-wide progress of counters from a > non-privileged application only interested in metrics for its own > context. Although we could theoretically try and subtract the progress > of other contexts before forwarding reports via read() we aren't in a > position to filter reports captured via MI_REPORT_PERF_COUNT commands. > As a result, for Gen8+, we always require the > dev.i915.perf_stream_paranoid to be unset for any access to OA metrics > if not root. > > v5: Drain submitted requests when enabling metric set to ensure no > lite-restore erases the context image we just updated (Lionel) > > v6: In addition to drain, switch to kernel context & update all > context in place (Chris) > > v7: Add missing mutex_unlock() if switching to kernel context fails > (Matthew) > > v8: Simplify OA period/flex-eu-counters programming by using the > batchbuffer instead of modifying ctx-image (Lionel) > > v9: Back to updating the context image (due to erroneous testing, > batchbuffer programming the OA unit doesn't actually work) > (Lionel) > Pin context before updating context image (Chris) > Drop MMIO programming now that we switch to a kernel context with > right values in initial context image (Chris) > > v10: Just pin_map the contexts we want to modify or let the > configuration happen on first use (Chris) > > v11: Update kernel context OA config through the batchbuffer rather > than on the fly ctx-image update (Lionel) > > v12: Rework OA context registers update again by swithing away from > user contexts and reconfiguring the kernel context through the > batchbuffer and updating all the other contexts' context image. > Also take care to lock slice/subslice configuration when OA is > on. (Lionel) > > v13: Request rpcs updates on all engine when updating the OA config > (Lionel) > > v14: Drop any kind of rpcs management now that we monitor sseu > configuration changes in a later patch (Lionel) > Remove usleep after programming the NOA configs on Gen8+, this > doesn't seem to be needed (Lionel) > > v15: Respect coding style for block comments (Chris) > > Signed-off-by: Robert Bragg <robert@xxxxxxxxxxxxx> > Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxx> > Reviewed-by: Matthew Auld <matthew.auld@xxxxxxxxx> \o/ > --- <SNIP> > diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h > index 52b3a1fd4059..eb7bb51c0a4a 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.h > +++ b/drivers/gpu/drm/i915/intel_lrc.h > @@ -62,6 +62,8 @@ enum { > INTEL_CONTEXT_SCHEDULE_OUT, > }; > > +struct sseu_dev_info; > + Seems bogus, no? _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx