This is a new i915 VGPU PV design based on last year's proposal [1]. It is a
new patch series; the old series is discontinued because of this new design.

To improve vgpu performance, PV optimizations can be implemented in different
GPU resource domains to reduce data access overhead or modeling complexity.
In this patch set, PPGTT and GGTT are identified as PV optimizations from the
VGPU memory resource point of view, and workload submission is identified as
a PV optimization from the VGPU compute resource point of view. Three PV
features (PV PPGTT, PV GGTT and PV submission) are therefore designed and
implemented to support the VGPU model better.

To provide a mechanism for PV feature development and implementation, a
simple PV framework is implemented. It consists of 3 sub-items:

a. PV capability: indicates which PV capabilities are provided by the guest
   system and by the host GVTg subsystem.
b. PV shared memory: memory allocated in the guest and shared between guest
   and host for data exchange and for PV command and PV data communication.
c. PV command transport protocol: built on top of the PV shared memory, it
   defines the communication protocol and channel between guest and host
   that carry PV commands and PV command data.

For PV PPGTT, to improve efficiency and reduce the complexity of ppgtt
support, vgpu ppgtt page table operations are implemented in PV fashion, with
PV versions of bind/unbind for the ppgtt vma ops. The PV ppgtt vma ops use
the CTB protocol to send the PV ppgtt command, along with the data struct
pv_vma, from the guest to GVT. GVT then implements the command handlers for
PV_CMD_BIND_PPGTT and PV_CMD_UNBIND_PPGTT to achieve GVA->HPA address
translation.

PV GGTT is similar to PV PPGTT, except that it uses the PV_CMD_BIND_GGTT and
PV_CMD_UNBIND_GGTT PV commands.

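The three framework pieces can be pictured with a small user-space model.
This is only a sketch under stated assumptions, not the code in this series:
the PV_CMD_* names and struct pv_vma come from the description above, while
the capability bit names, all numeric values, the pv_vma fields, the ring
layout and the pv_negotiate()/pv_send()/pv_recv() helpers are hypothetical.
It is a plain single-threaded ring, not the actual CTB format, and it omits
the memory barriers and host notification a real guest/host channel needs.

```c
#include <assert.h>
#include <stdint.h>

/* Capability bits: names and values are illustrative assumptions. */
enum pv_caps {
	PV_CAP_PPGTT      = 1u << 0,
	PV_CAP_GGTT       = 1u << 1,
	PV_CAP_SUBMISSION = 1u << 2,
};

/* Command names are from the cover letter; the values are assumed. */
enum pv_cmd {
	PV_CMD_BIND_PPGTT = 1,
	PV_CMD_UNBIND_PPGTT,
	PV_CMD_BIND_GGTT,
	PV_CMD_UNBIND_GGTT,
};

/* Hypothetical payload layout; the real struct pv_vma may differ. */
struct pv_vma {
	uint64_t gva;	/* guest graphics virtual address */
	uint64_t size;	/* length of the mapping in bytes */
};

#define PV_RING_SLOTS 8u	/* power of two, so counter wraparound is safe */

struct pv_msg {
	uint32_t cmd;
	struct pv_vma vma;
};

/* Hypothetical shared-memory layout: guest produces, host consumes. */
struct pv_shared_page {
	uint32_t head;	/* free-running count of messages consumed */
	uint32_t tail;	/* free-running count of messages produced */
	struct pv_msg ring[PV_RING_SLOTS];
};

/* A feature is usable only if both guest and host advertise it. */
uint32_t pv_negotiate(uint32_t guest_caps, uint32_t host_caps)
{
	return guest_caps & host_caps;
}

/* Guest side: queue one command; returns -1 if the ring is full. */
int pv_send(struct pv_shared_page *p, uint32_t cmd, const struct pv_vma *vma)
{
	assert(p && vma);
	if (p->tail - p->head == PV_RING_SLOTS)
		return -1;
	p->ring[p->tail % PV_RING_SLOTS].cmd = cmd;
	p->ring[p->tail % PV_RING_SLOTS].vma = *vma;
	p->tail++;
	return 0;
}

/* Host side: dequeue one command; returns -1 if the ring is empty. */
int pv_recv(struct pv_shared_page *p, struct pv_msg *out)
{
	assert(p && out);
	if (p->head == p->tail)
		return -1;
	*out = p->ring[p->head % PV_RING_SLOTS];
	p->head++;
	return 0;
}
```

In this model the guest would call pv_send() with PV_CMD_BIND_PPGTT and a
filled-in pv_vma only when pv_negotiate() reports the PPGTT bit, and the host
would drain the ring with pv_recv() and dispatch on msg.cmd.
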
For PV workload submission, a PV version of the workload submission backend
is implemented with the engine submission data placed in the shared memory.
It also eliminates execlists CSB processing and the context switch interrupt
in the submission routine, to improve efficiency and reduce complexity.

Based on experiments, small workloads such as glmark2 and the Antutu 3D
benchmark gain at least 10% performance from these PV features. Large
workloads such as media and 3D get some benefit, but not much.

[1]: https://patchwork.kernel.org/cover/11148059/

Xiaolin Zhang (12):
  drm/i915: introduced vgpu pv capability
  drm/i915: vgpu shared memory setup for pv support
  drm/i915: vgpu pv command buffer transport protocol
  drm/i915: vgpu ppgtt page table pv support
  drm/i915: vgpu ggtt page table pv support
  drm/i915: vgpu workload submisison pv support
  drm/i915/gvt: GVTg expose pv_caps PVINFO register
  drm/i915/gvt: GVTg handle guest shared_page setup
  drm/i915/gvt: GVTg support vgpu pv CTB protocol
  drm/i915/gvt: GVTg support ppgtt pv operations
  drm/i915/gvt: GVTg support ggtt pv operations
  drm/i915/gvt: GVTg support pv workload submssion

 drivers/gpu/drm/i915/Makefile              |   2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c       |   4 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c        |   2 +
 drivers/gpu/drm/i915/gvt/gtt.c             | 255 ++++++++++++++
 drivers/gpu/drm/i915/gvt/gtt.h             |   4 +
 drivers/gpu/drm/i915/gvt/gvt.h             |  17 +-
 drivers/gpu/drm/i915/gvt/handlers.c        | 274 ++++++++++++++-
 drivers/gpu/drm/i915/gvt/vgpu.c            |  47 +++
 drivers/gpu/drm/i915/i915_debugfs.c        |   3 +
 drivers/gpu/drm/i915/i915_drv.c            |   2 +
 drivers/gpu/drm/i915/i915_drv.h            |   5 +-
 drivers/gpu/drm/i915/i915_gem.c            |   4 +-
 drivers/gpu/drm/i915/i915_pvinfo.h         |   9 +-
 drivers/gpu/drm/i915/i915_vgpu.c           | 533 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_vgpu.h           | 122 +++++++
 drivers/gpu/drm/i915/intel_pv_submission.c | 324 ++++++++++++++++++
 16 files changed, 1599 insertions(+), 8 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_pv_submission.c

-- 
2.7.4