Quoting Oscar Mateo (2017-10-27 19:45:39)
>
>
> On 10/27/2017 11:30 AM, Chris Wilson wrote:
> > Quoting Oscar Mateo (2017-10-27 19:01:03)
> >> AubCrash is a companion to i915_gpu_error. It gives us the possibility to
> >> dump an AUB file that describes the state of the system at the point of
> >> the crash (GTTs, contexts, BBs, BOs, etc...). Being an AUB file, it can be
> >> used by a number of already existing tools (graphical AUB file browsers,
> >> simulators, emulators, etc...) that facilitate debugging (an improvement
> >> over the current text-based crash dump).
> > Since it is capture everything in progress, but only the kernel side of
> > it, why put it in the kernel? Is this absolutely required for
> > post-mortem debugging, or should we focus on capturing the death throes
> > of userspace much better (an aubcapture flight-data-recorder, plus
> > client annotations more akin to apitrace)?
> >
> > Sell me with the bugzilla references.
> > -Chris
>
> An aubcapture flight-data-recorder is the next logical step. Like
> i-g-t's intel_aubdump tool, but at the kernel level, so that it includes
> everything: contexts, WA BBs, virtual GPU addresses, pagetables, etc...
> The trojan horse for that is "drm/i915: Add an AUB file format writer".
> Now you only have to add a couple of debugfs entries (one for start/stop
> the capture, one to retrieve the AUB file as it gets created via 'relay
> channel') and a number of hooks around i915 to capture everything that
> can be interesting.

But we don't need to do that at the kernel level, as the ioctl interface
is the defining uABI. The only things we can't snoop are the real
physical addresses, but afaik the replay aspect doesn't need real
addresses, just consistent ones.

Don't do anything in the kernel that can be done in userspace, because
we can never get it out again. We really do need compelling arguments as
to why it is impossible to do what needs to be done from userspace.

And however we put it, we can't just leak physical addresses or other
low-level information that opens us up to abuse, or to one client
snooping on another. At least not without a very good defense to hide
behind when it is spotted. The argument has to be really compelling if
you want us to maintain this for all platforms for the next decade+.

One suggestion is that we put all the dodgy stuff in an auxiliary
module, not even just hiding behind a module option. Of course that
makes the post-mortem aspect impossible.

(I'm not saying I have the answers, just that it's a high bar we have
to pass.)
-Chris
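
To illustrate the point that replay only needs consistent addresses, not
real ones, here is a minimal sketch of how a userspace recorder could
hand out synthetic addresses. Everything in it is hypothetical and not
taken from the patch series or from intel_aubdump: the class name
SyntheticAddressMap, the base address, the 4096-byte page size, and the
idea of keying on an integer buffer handle are all assumptions made for
illustration.

```python
PAGE_SIZE = 4096  # assumed page granularity of the synthetic address space


class SyntheticAddressMap:
    """Hand out stable, fake addresses for buffer handles (hypothetical helper).

    No real physical address is ever looked up; the only guarantee is that
    the same handle always maps to the same page-aligned address.
    """

    def __init__(self, base=0x100000):
        self._next = base        # next unused synthetic address
        self._by_handle = {}     # handle -> synthetic address

    def address_for(self, handle, size):
        """Return the synthetic address for a handle, allocating on first use."""
        if handle in self._by_handle:
            return self._by_handle[handle]
        addr = self._next
        # Round the size up to whole pages so ranges never overlap.
        pages = max((size + PAGE_SIZE - 1) // PAGE_SIZE, 1)
        self._next += pages * PAGE_SIZE
        self._by_handle[handle] = addr
        return addr
```

A recorder built this way would write the synthetic address wherever the
dump format expects a physical one, so nothing about the real memory
layout leaves the process.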