This is a very cool patch series. I made some specific comments on some of the patches. But overall this is great. I guess your plan is to start testing the SVM programming model with ATC, and then enable the same programming model without the IOMMU using HMM. That means there could be working and validated tests to run on HMM. I think 4.16 will get an IOMMUv2 driver that we worked on for Raven to support PPR on newer IOMMU versions. Without that the GPU is not able to make swapped or new pages resident or trigger a COW. We're currently still working on one problem with Raven, related to the way GFX9 retries memory accesses. Many PPR requests for the same virtual address can be outstanding (in an IOMMU log buffer). After the first request is handled, the GPU can continue, but the remaining requests are still in the queue. This can result in the IOMMU driver trying to handle a PPR for a page that's already freed by the application, which triggers an invalid PPR callback. An invalid PPR is like the GPU-equivalent of a segfault, and KFD implements it like that. With the above behaviour we end up segfaulting applications that didn't do anything wrong. I guess for your implementation it's not a problem because you don't implement that callback yet. Regards,  Felix On 2018-01-26 03:13 PM, Christian König wrote: > That got mixed up with the encode ring function. > > Signed-off-by: Christian König <christian.koenig at amd.com> > --- > drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 19 ++++++++++++++++++- > 1 file changed, 18 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c > index 44c041a1fe68..24ebc3e296a6 100644 > --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c > @@ -880,6 +880,22 @@ static void vcn_v1_0_dec_ring_emit_vm_flush(struct amdgpu_ring *ring, > vcn_v1_0_dec_vm_reg_wait(ring, data0, data1, mask); > } > > +static void vcn_v1_0_dec_ring_emit_wreg(struct amdgpu_ring *ring, > + uint32_t reg, uint32_t val) > +{ > + struct amdgpu_device *adev = ring->adev; > + > + amdgpu_ring_write(ring, > + PACKET0(SOC15_REG_OFFSET(UVD, 0, mmUVD_GPCOM_VCPU_DATA0), 0)); > + amdgpu_ring_write(ring, reg << 2); > + amdgpu_ring_write(ring, > + PACKET0(SOC15_REG_OFFSET(UVD, 0, mmUVD_GPCOM_VCPU_DATA1), 0)); > + amdgpu_ring_write(ring, val); > + amdgpu_ring_write(ring, > + PACKET0(SOC15_REG_OFFSET(UVD, 0, mmUVD_GPCOM_VCPU_CMD), 0)); > + amdgpu_ring_write(ring, VCN_DEC_CMD_WRITE_REG << 1); > +} > + > /** > * vcn_v1_0_enc_ring_get_rptr - get enc read pointer > * > @@ -1097,7 +1113,7 @@ static const struct amdgpu_ring_funcs vcn_v1_0_dec_ring_vm_funcs = { > .pad_ib = amdgpu_ring_generic_pad_ib, > .begin_use = amdgpu_vcn_ring_begin_use, > .end_use = amdgpu_vcn_ring_end_use, > - .emit_wreg = vcn_v1_0_enc_ring_emit_wreg, > + .emit_wreg = vcn_v1_0_dec_ring_emit_wreg, > }; > > static const struct amdgpu_ring_funcs vcn_v1_0_enc_ring_vm_funcs = { > @@ -1124,6 +1140,7 @@ static const struct amdgpu_ring_funcs vcn_v1_0_enc_ring_vm_funcs = { > .pad_ib = amdgpu_ring_generic_pad_ib, > .begin_use = amdgpu_vcn_ring_begin_use, > .end_use = amdgpu_vcn_ring_end_use, > + .emit_wreg = vcn_v1_0_enc_ring_emit_wreg, > }; > > static void vcn_v1_0_set_dec_ring_funcs(struct amdgpu_device *adev)