On Wed, Aug 5, 2015 at 6:55 AM, <sourab.gupta@xxxxxxxxx> wrote:
> From: Sourab Gupta <sourab.gupta@xxxxxxxxx>
>
> This patch adds support for retrieving MMIO register values alongwith
> timestamps and forwarding them to userspace through perf.
> The userspace can request upto 8 MMIO register values to be dumped.
> The addresses of upto 8 MMIO registers can be passed through perf attr
> config. The registers are checked against a whitelist before passing them
> on. The commands to dump the values of these MMIO registers are then
> inserted into the ring alongwith commands to dump the timestamps.

Considering the discussion so far with Peter: one thing raised was a
preference for exposing individual counters via separate events. In the
case of OA metrics I don't think that's at all as straightforward as it
sounds, due to the way the OA unit is configured and reports counters,
but for mmio-based counters the configurations are completely
orthogonal (just an address), so I don't know that there's a need to
configure multiple reads per event, and I imagine we should be able to
avoid the arbitrary limit of 8 reads.

Perf allows users to group event fds together, which signifies to the
kernel that the user wants the counters to be reported in the same
buffer (the buffer of the group leader). A more extensible list of
registers that should be read via the SRM commands could be indirectly
derived by maintaining a list of the active mmio-read events.

I think something else to raise here is that it could help if we had
some more concrete use cases and at least some prototype userspace
code for this interface. I guess the requirements around privileges
could depend a bit on what specific registers you're interested in.
If security requirements may vary for different counters, I do also
wonder if, instead of a generic mmio event, it might be appropriate to
enumerate what we're interested in and have a separate event for each
specific counter, considering requirements on a case-by-case basis.

I wonder if we should also consider exposing 64bit counters such as
the pipeline statistics here. intel_gpu_top tries to expose pipeline
statistics, but one problem it faces is that these are per-context
counters, so it would be better to read them via the command stream
with a mechanism like this, instead of periodically, so that the reads
can be reliably mapped to a context. In general a mechanism like this
could be a good fit for exposing per-context metrics to a system
compositor (metrics not well suited to periodic sampling).

- Robert
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx