Since we cannot ensure that VRAM is consistent after a GPU reset, page table shadowing is necessary. Shadowed page tables are, in a sense, a way to recover the consistent state the page tables had before the reset occurred.

When creating a page table, we allocate a GTT bo as the shadow of the VRAM bo and keep the two identical. After a GPU reset, we use SDMA to copy the GTT bo content back to the VRAM bo, which recovers the page table.

V2: The shadow bo uses a shadow entity running on the normal run queue. After a GPU reset, we need to wait for all shadow jobs to finish first, then recover the page table from the shadow.

V3: Addressed Christian's comments on the shadow bo part.

Chunming Zhou (18):
  drm/amdgpu: add shadow bo support V2
  drm/amdgpu: validate shadow as well when validating bo
  drm/amdgpu: allocate shadow for pd/pt bo V2
  drm/amdgpu: add shadow flag V2
  drm/amdgpu: sync bo and shadow
  drm/amdgpu: implement vm recovery function from shadow
  drm/amdgpu: add shadow_entity for shadow page table updates
  drm/amdgpu: update pd shadow bo
  drm/amdgpu: update pt shadow
  drm/amd: add last fence in sched entity
  drm/amdgpu: link all vm clients
  drm/amdgpu: add vm_list_lock
  drm/amd: add block entity function
  drm/amdgpu: add shadow fence owner
  drm/amd: block entity
  drm/amdgpu: recover page tables after gpu reset
  drm/amdgpu: add need backup function
  drm/amdgpu: add backup condition for vm

 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  25 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  82 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  88 +++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    | 100 ++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h    |   5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c      |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        | 281 +++++++++++++++++++-------
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.c |  38 +++-
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.h |   5 +
 include/uapi/drm/amdgpu_drm.h                 |   2 +
 10 files changed, 524 insertions(+), 105 deletions(-)

-- 
1.9.1
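
For readers unfamiliar with the idea, the shadow-and-recover flow described in the cover text can be sketched as a small user-space model. This is only an illustration under stated assumptions, not code from the series: `struct pt_model`, `pt_update()`, `pt_simulate_reset()` and `pt_recover()` are hypothetical names, plain arrays stand in for the VRAM and GTT bos, and `memcpy()` stands in for the SDMA copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PT_ENTRIES 8

/* Hypothetical model of one page table and its shadow. */
struct pt_model {
	uint64_t vram[PT_ENTRIES];   /* stands in for the VRAM bo */
	uint64_t shadow[PT_ENTRIES]; /* stands in for the GTT shadow bo */
};

/* Start with both copies zeroed, so they are identical. */
static void pt_init(struct pt_model *pt)
{
	memset(pt, 0, sizeof(*pt));
}

/*
 * Every page table update is applied to both copies so that the
 * shadow always matches the VRAM contents. In this model the shadow
 * write is synchronous; the series instead submits it as a job on a
 * shadow entity, which is why recovery there must wait for all
 * shadow jobs to finish first.
 */
static void pt_update(struct pt_model *pt, size_t idx, uint64_t pte)
{
	assert(idx < PT_ENTRIES);
	pt->vram[idx] = pte;
	pt->shadow[idx] = pte;
}

/* Model a GPU reset: VRAM contents can no longer be trusted. */
static void pt_simulate_reset(struct pt_model *pt)
{
	memset(pt->vram, 0xff, sizeof(pt->vram));
}

/* Recover the page table by copying the shadow back into VRAM. */
static void pt_recover(struct pt_model *pt)
{
	memcpy(pt->vram, pt->shadow, sizeof(pt->vram));
}
```

In this model, calling `pt_update()` a few times, then `pt_simulate_reset()`, then `pt_recover()` leaves `vram` equal to `shadow` again, which is the property the series relies on after a GPU reset.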