[PATCH 01/13] drm/amdgpu: fix vcn_v1_0_dec_ring_emit_wreg

felix.kuehling@xxxxxxx (Felix Kuehling) · Mon, 29 Jan 2018 18:01:41 -0500

This is a very cool patch series. I made some specific comments on some
of the patches. But overall this is great.

I guess your plan is to start testing the SVM programming model with
ATC, and then enable the same programming model without the IOMMU using
HMM. That means there could be working and validated tests to run on HMM.

I think 4.16 will get an IOMMUv2 driver that we worked on for Raven to
support PPR on newer IOMMU versions. Without that the GPU is not able to
make swapped or new pages resident or trigger a COW.

We're currently still working on one problem with Raven, related to the
way GFX9 retries memory accesses. Many PPR requests for the same virtual
address can be outstanding (in an IOMMU log buffer). After the first
request is handled, the GPU can continue, but the remaining requests are
still in the queue. This can result in the IOMMU driver trying to handle
a PPR for a page that's already freed by the application, which triggers
an invalid PPR callback.

An invalid PPR is like the GPU-equivalent of a segfault, and KFD
implements it like that. With the above behaviour we end up segfaulting
applications that didn't do anything wrong. I guess for your
implementation it's not a problem because you don't implement that
callback yet.

Regards,
Â  Felix

On 2018-01-26 03:13 PM, Christian KÃ¶nig wrote:
> That got mixed up with the encode ring function.
>
> Signed-off-by: Christian KÃ¶nig <christian.koenig at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> index 44c041a1fe68..24ebc3e296a6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> @@ -880,6 +880,22 @@ static void vcn_v1_0_dec_ring_emit_vm_flush(struct amdgpu_ring *ring,
>  	vcn_v1_0_dec_vm_reg_wait(ring, data0, data1, mask);
>  }
>  
> +static void vcn_v1_0_dec_ring_emit_wreg(struct amdgpu_ring *ring,
> +					uint32_t reg, uint32_t val)
> +{
> +	struct amdgpu_device *adev = ring->adev;
> +
> +	amdgpu_ring_write(ring,
> +		PACKET0(SOC15_REG_OFFSET(UVD, 0, mmUVD_GPCOM_VCPU_DATA0), 0));
> +	amdgpu_ring_write(ring, reg << 2);
> +	amdgpu_ring_write(ring,
> +		PACKET0(SOC15_REG_OFFSET(UVD, 0, mmUVD_GPCOM_VCPU_DATA1), 0));
> +	amdgpu_ring_write(ring, val);
> +	amdgpu_ring_write(ring,
> +		PACKET0(SOC15_REG_OFFSET(UVD, 0, mmUVD_GPCOM_VCPU_CMD), 0));
> +	amdgpu_ring_write(ring, VCN_DEC_CMD_WRITE_REG << 1);
> +}
> +
>  /**
>   * vcn_v1_0_enc_ring_get_rptr - get enc read pointer
>   *
> @@ -1097,7 +1113,7 @@ static const struct amdgpu_ring_funcs vcn_v1_0_dec_ring_vm_funcs = {
>  	.pad_ib = amdgpu_ring_generic_pad_ib,
>  	.begin_use = amdgpu_vcn_ring_begin_use,
>  	.end_use = amdgpu_vcn_ring_end_use,
> -	.emit_wreg = vcn_v1_0_enc_ring_emit_wreg,
> +	.emit_wreg = vcn_v1_0_dec_ring_emit_wreg,
>  };
>  
>  static const struct amdgpu_ring_funcs vcn_v1_0_enc_ring_vm_funcs = {
> @@ -1124,6 +1140,7 @@ static const struct amdgpu_ring_funcs vcn_v1_0_enc_ring_vm_funcs = {
>  	.pad_ib = amdgpu_ring_generic_pad_ib,
>  	.begin_use = amdgpu_vcn_ring_begin_use,
>  	.end_use = amdgpu_vcn_ring_end_use,
> +	.emit_wreg = vcn_v1_0_enc_ring_emit_wreg,
>  };
>  
>  static void vcn_v1_0_set_dec_ring_funcs(struct amdgpu_device *adev)