From: Akash Goel <akash.goel@xxxxxxxxx>

GuC firmware logs its debug messages into a Host-GuC shared memory
buffer and, when the buffer is half full, it sends a Flush interrupt to
the Host. GuC firmware follows the half-full draining protocol: it
expects that while it is writing to the 2nd half of the buffer, the 1st
half gets consumed by the Host, and that it then gets a flush completed
acknowledgment from the Host, so that it does not end up doing any
overwrite causing loss of logs.

So far the flush interrupt wasn't enabled on the Host side, and the User
could capture the contents/snapshot of the log buffer through the
'i915_guc_log_dump' debugfs interface. But this couldn't meet a couple
of key requirements, especially of Validation. The first is to ensure
capturing of all boot time logs even with a high verbosity level. The
second is to enable capturing of logs in a sustained manner, e.g. for
the entire duration of a workload.

Now the Driver will enable the flush interrupt and, on receiving it,
will copy the contents of the log buffer into its local buffer. The
local buffer is big enough to contain multiple snapshots of the log
buffer, giving ample time to the User to pull boot time messages.

Have added a debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to collect the logs. Availed the relay framework to implement this
interface, where the Driver just has to use a relay API to store
snapshots of the GuC log buffer in a buffer managed by relay.

The relay buffer can be operated in a mode, equivalent to 'dmesg -c',
where the old data, not yet collected by the User, will be overwritten
if the buffer becomes full, or it can be operated in no-overwrite mode,
where relay will stop accepting new data if all sub buffers are full.
Have used the latter mode to avoid the possibility of getting garbled
data.

Besides the mmap method, through which the User can directly access the
relay buffer contents, relay also supports the 'poll' method.
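
As an illustration only, a user space reader built on poll could be
sketched as below. This is not code from the series or from
intel_guc_logger; wait_and_read() is a hypothetical helper name, and
the sketch assumes the log file is read with plain poll()/read() calls
on a file descriptor opened by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not part of this series).
 *
 * Wait up to timeout_ms for fd to become readable, then read at most
 * len bytes into buf.
 *
 * Returns the number of bytes read (> 0), 0 on timeout or end of file,
 * and -1 on error.
 */
static ssize_t wait_and_read(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	assert(buf != NULL && len > 0);

	/* Restart the wait if a signal interrupts poll(). */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -1;	/* poll() itself failed */
	if (ret == 0)
		return 0;	/* timed out, no new data yet */

	if (pfd.revents & POLLIN)
		return read(fd, buf, len);	/* new data is available */
	if (pfd.revents & POLLHUP)
		return 0;	/* other end closed, nothing left */

	return -1;		/* POLLERR or POLLNVAL */
}
```

A logger would call such a helper in a loop on the descriptor of the
debugfs log file and append every non-empty chunk to its output file,
treating a return value of 0 as "nothing new yet".
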
Through the 'poll' call on the log file, the User can come to know
whenever a new snapshot of the log buffer is taken by the Driver, so it
can run in tandem with the Driver and thus capture logs in a
sustained/streaming manner, without any loss of data.

The logs can be captured from the relay backed debugfs file through the
utility igt/tools/intel_guc_logger.

v2: Rebased to the latest drm-intel-nightly.
v3: Aligned with the modification of late debugfs registration, at the
    end of i915 Driver load. Did cleanup as per Tvrtko's review
    comments, added 3 new patches to optimize the log-buffer flush
    interrupt handling, gather and report the logging related stats.
v4: Added 2 new patches to further optimize the log-buffer flush
    interrupt handling. Did cleanup as per Chris's review comments,
    fixed a couple of issues related to clearing of the Guc2Host
    message register. Switched to no-overwrite mode for the relay.
v5: Added a new patch to avail the MOVNTDQA instruction based fast
    memcpy provided by a patch from Chris. Dropped the rt priority
    kthread patch, after evaluating all the optimizations with certain
    benchmarks like synmark_oglmultithread, synmark_oglbatch5, which
    generate flush interrupts almost every ms or less. Updated the
    older patches as per the review comments from Tvrtko and Chris W.
    Added a new patch to augment i915 error state with the GuC log
    buffer contents. Fixed the issue of User interrupt getting disabled
    for VEBOX ring, causing failure for certain IGTs. Also included 2
    patches to support early logging for capturing boot time logs and
    to use per CPU constructs on the relay side, so as to address a
    WARNING issue with the call to relay_reserve() without disabling
    preemption.
v6: Mainly did the rebasing, refactoring, cleanup as per the review
    comments and fixed errors/warnings reported by checkpatch.
v7: Added a new patch to complete the pending log buffer flush work
    item in system suspend case. Cleaned up the irq handler & work item
    function by removing the check for GuC interrupts.
v8: Replaced the patch added in last version with a patch which marks
    the GuC log buffer flush interrupt handling WQ as freezable, as per
    the inputs from Imre. Refactored the log buffer sampling function
    and added a new helper function to improve the readability as per
    suggestions from Tvrtko.
v9: As per Chris's comment, removed the forceful flush of GuC log
    buffer from the error state capture path as that could have
    disturbed the atomicity required in error state path. Squashed the
    wc type vmalloc mapping patch with SSE4.1 movntdqa based memcpy
    patch. Added a BUG_ON for the relay buffer allocation size.
v10: Mainly rebasing. Made the dedicated WQ as a high priority one.

Akash Goel (12):
  drm/i915: New structure to contain GuC logging related fields
  drm/i915: Add low level set of routines for programming PM
    IER/IIR/IMR register set
  relay: Use per CPU constructs for the relay channel buffer pointers
  drm/i915: Add a relay backed debugfs interface for capturing GuC logs
  drm/i915: New lock to serialize the Host2GuC actions
  drm/i915: Add stats for GuC log buffer flush interrupts
  drm/i915: Optimization to reduce the sampling time of GuC log buffer
  drm/i915: Increase GuC log buffer size to reduce flush interrupts
  drm/i915: Augment i915 error state to include the dump of GuC log
    buffer
  drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log
    buffer
  drm/i915: Early creation of relay channel for capturing boot time
    logs
  drm/i915: Mark the GuC log buffer flush interrupts handling WQ as
    freezable

Sagar Arun Kamble (6):
  drm/i915: Decouple GuC log setup from verbosity parameter
  drm/i915: Add GuC ukernel logging related fields to fw interface file
  drm/i915: Support for GuC interrupts
  drm/i915: Handle log buffer flush interrupt event from GuC
  drm/i915: Support for forceful flush of GuC log buffer
  drm/i915: Debugfs support for GuC logging control

 drivers/gpu/drm/i915/Kconfig               |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c        |  73 +++-
 drivers/gpu/drm/i915/i915_drv.c            |   2 +
 drivers/gpu/drm/i915/i915_drv.h            |   5 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  15 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 597 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_irq.c            | 159 ++++++--
 drivers/gpu/drm/i915/i915_reg.h            |  11 +
 drivers/gpu/drm/i915/intel_drv.h           |   6 +
 drivers/gpu/drm/i915/intel_guc.h           |  30 +-
 drivers/gpu/drm/i915/intel_guc_fwif.h      |  82 +++-
 drivers/gpu/drm/i915/intel_guc_loader.c    |  10 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |   4 +-
 include/linux/relay.h                      |  17 +-
 kernel/relay.c                             |  74 ++--
 15 files changed, 999 insertions(+), 87 deletions(-)

--
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx