From: Brad Volkin <bradley.d.volkin@xxxxxxxxx> Certain OpenGL features (e.g. transform feedback, performance monitoring) require userspace code to submit batches containing commands such as MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some generations of the hardware will noop these commands in "unsecure" batches (which includes all userspace batches submitted via i915) even though the commands may be safe and represent the intended programming model of the device. This series introduces a software command parser similar in operation to the command parsing done in hardware for unsecure batches. However, the software parser allows some operations that would be noop'd by hardware, if the parser determines the operation is safe, and submits the batch as "secure" to prevent hardware parsing. Currently the series implements this on IVB and HSW. The series is divided into several phases: patches 01-09: These implement infrastructure and the command parsing algorithm, all behind a module parameter. I expect some discussion and rework, but hopefully there's nothing too controversial. patches 10-17: These define the checks performed by the parser. I expect much discussion :) patches 18-20: In a final pass over the command checks, I found some issues with the definitions. They looked painful to rebase in, so I've added them here. patches 21-22: These enable the parser by default. It runs on all batches except those that set the I915_EXEC_SECURE flag in the execbuffer2 call. There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very basic and do not test all of the commands used by the parser on the assumption that I'm likely to make the same mistakes in both the parser and the test. I've run the i-g-t gem_* tests, the piglit quick tests (w/Mesa git from a few days ago), and generally used an Ubuntu 13.10 IVB system with the parser running. Aside from a failure described below, I don't think there are any regressions. That is, piglit claims some regressions, but from manually running the tests I think these are false positives. However, I could use help in getting broader testing, particularly around performance. In general, I see less than 3% performance impact on HSW, with more like 10% impact for pathological batch sizes. But we'll certainly want to run relevant benchmarks beyond what I've done. At this point there are a couple of known issues and potential improvements. 1) VLV. The parser is currently disabled for VLV. One type of check performed by the parser is that commands which access memory do so via PPGTT. VLV does not have PPGTT enabled at this time. I chose to implement the PPGTT checks via generic bit checking infrastructure in the parser, so they are not easily disabled for VLV. For now, I'm disabling parsing altogether in the hope that PPGTT can be enabled for VLV in the near future. 2) Coherency. I've found two types of coherency issues when reading the batch buffer from the CPU during execbuffer2. Looking for help with both issues. i. First, the i-g-t test gem_cpu_reloc blits to a batch buffer and the parser isn't properly waiting for the operation to complete before parsing. I tried adding i915_gem_object_sync(batch_obj, [ring|NULL]) but that actually caused more failures. ii. Second, on VLV, I've seen cache coherency issues when userspace writes the batch via pwrite fast path before calling execbuffer2. The parser reads stale data. This works fine on IVB and HSW, so I believe it's an LLC vs. non-LLC issue. I'm just unclear on what the correct flushing or synchronization is for this scenario. 3) 2nd-level batches. The parser currently allows MI_BATCH_BUFFER_START commands in userspace batches without parsing them. The media driver uses 2nd-level batches, so a solution is required. I have some ideas but don't want to delay the review process for what I have so far. It may be that the 2nd-level parsing is only needed for VCS and the current code (plus rejecting BBS) would be sufficient for RCS. 4) Command buffer copy. To avoid CPU modifications to buffers after parsing, and to avoid GPU modifications to buffers via EUs or commands in the batch, we should copy the userspace batch buffer to memory that userspace does not have access to, map it into GGTT, and execute that batch buffer. I have a sense of how to do this for 1st-level batches, but it would need changes to tie in with the 2nd-level batch parsing I think, so I've again held off. Brad Volkin (22): drm/i915: Add data structures for command parser drm/i915: Initial command parser table definitions drm/i915: Hook command parser tables up to rings drm/i915: Add per-ring command length decode functions drm/i915: Implement command parsing drm/i915: Add a HAS_CMD_PARSER getparam drm/i915: Add support for rejecting commands during parsing drm/i915: Add support for checking register accesses drm/i915: Add support for rejecting commands via bitmasks drm/i915: Reject unsafe commands drm/i915: Add register whitelists for mesa drm/i915: Enable register whitelist checks drm/i915: Enable bit checking for some commands drm/i915: Enable PPGTT command parser checks drm/i915: Reject commands that would store to global HWS page drm/i915: Reject additional commands drm/i915: Add parser data for perf monitoring GL extensions drm/i915: Reject MI_ARB_ON_OFF on VECS drm/i915: Fix length handling for MFX_WAIT drm/i915: Fix MI_STORE_DWORD_IMM parser defintion drm/i915: Clean up command parser enable decision drm/i915: Enable command parsing by default drivers/gpu/drm/i915/Makefile | 3 +- drivers/gpu/drm/i915/i915_cmd_parser.c | 712 +++++++++++++++++++++++++++++ drivers/gpu/drm/i915/i915_dma.c | 3 + drivers/gpu/drm/i915/i915_drv.c | 5 + drivers/gpu/drm/i915/i915_drv.h | 96 ++++ drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 + drivers/gpu/drm/i915/i915_reg.h | 66 +++ drivers/gpu/drm/i915/intel_ringbuffer.c | 2 + drivers/gpu/drm/i915/intel_ringbuffer.h | 25 + include/uapi/drm/i915_drm.h | 1 + 10 files changed, 927 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/i915/i915_cmd_parser.c -- 1.8.4.4 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx