On Mon, May 15, 2023 at 11:02 AM Xiao, Jack <Jack.Xiao@xxxxxxx> wrote: > > [AMD Official Use Only - General] > > > Yes, it should work. If that, the routine would behave like *reset*. That means driver would behave like to create a new queue instead of restoring a saved queue. I can have a try with this solution. > I think it should work as is. At least suspend/resume worked when I tested it. The logic in gfx_v11_0_gfx_init_queue() is: if (!amdgpu_in_reset(adev) && !adev->in_suspend) { ... // Save the original image after init at driver load time memcpy(adev->gfx.me.mqd_backup[mqd_idx], mqd, sizeof(*mqd)); } else { // restore the saved image at resume or reset time memcpy(mqd, adev->gfx.me.mqd_backup[mqd_idx], sizeof(*mqd)); ... } Alex > Regards, > Jack > ________________________________ > From: Alex Deucher <alexdeucher@xxxxxxxxx> > Sent: Monday, 15 May 2023 21:08 > To: Xiao, Jack <Jack.Xiao@xxxxxxx> > Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx> > Subject: Re: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM > > On Mon, May 15, 2023 at 6:40 AM Xiao, Jack <Jack.Xiao@xxxxxxx> wrote: > > > > [AMD Official Use Only - General] > > > > The MQD data in VRAM would be lost after S3, for the MQD bo is pinned down as kernel bo and can't be evicted to system memory. > > AFAIK, driver should not to do allocate/free memory during S3, as there are issues observed to do memory allocation during S3. > > We restore the contents of the MQD in gfx_v*_0_gfx_init_queue() and > gfx_v*_0_kcq_init_queue(). > > Alex > > > > > Regards, > > Jack > > > > -----Original Message----- > > From: Alex Deucher <alexdeucher@xxxxxxxxx> > > Sent: Friday, May 12, 2023 9:13 PM > > To: Xiao, Jack <Jack.Xiao@xxxxxxx> > > Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Deucher, Alexander <Alexander.Deucher@xxxxxxx> > > Subject: Re: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM > > > > On Fri, May 12, 2023 at 4:16 AM Jack Xiao <Jack.Xiao@xxxxxxx> wrote: > > > > > > Make the preemption optimization effect only for SRIOV, for it caused > > > failure to resume from S3. > > > > Can you elaborate? We ultimately want MQDs in VRAM for performance reasons even for bare metal. > > > > Alex > > > > > > > > Signed-off-by: Jack Xiao <Jack.Xiao@xxxxxxx> > > > --- > > > drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 ++- > > > drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 7 +++++-- > > > drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 7 +++++-- > > > 3 files changed, 12 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > > index a22d88a4178a..1b795b7bbf38 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > > @@ -385,7 +385,8 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, > > > u32 domain = AMDGPU_GEM_DOMAIN_GTT; > > > > > > /* Only enable on gfx10 and 11 for now to avoid changing behavior on older chips */ > > > - if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0)) > > > + if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0) && > > > + amdgpu_sriov_vf(adev)) > > > domain |= AMDGPU_GEM_DOMAIN_VRAM; > > > > > > /* create MQD for KIQ */ > > > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > > > b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > > > index 4560476c7c31..5c3d3f6c7ebd 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > > > @@ -889,6 +889,7 @@ static int mes_v10_1_mqd_sw_init(struct > > > amdgpu_device *adev, { > > > int r, mqd_size = sizeof(struct v10_compute_mqd); > > > struct amdgpu_ring *ring; > > > + u32 domain = AMDGPU_GEM_DOMAIN_GTT; > > > > > > if (pipe == AMDGPU_MES_KIQ_PIPE) > > > ring = &adev->gfx.kiq[0].ring; @@ -900,9 +901,11 @@ > > > static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev, > > > if (ring->mqd_obj) > > > return 0; > > > > > > + if (amdgpu_sriov_vf(adev)) > > > + domain |= AMDGPU_GEM_DOMAIN_VRAM; > > > + > > > r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, > > > - AMDGPU_GEM_DOMAIN_VRAM | > > > - AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, > > > + domain, &ring->mqd_obj, > > > &ring->mqd_gpu_addr, &ring->mqd_ptr); > > > if (r) { > > > dev_warn(adev->dev, "failed to create ring mqd bo > > > (%d)", r); diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > > b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > > index 3adb450eec07..79a4d2bfd94a 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > > @@ -987,6 +987,7 @@ static int mes_v11_0_mqd_sw_init(struct > > > amdgpu_device *adev, { > > > int r, mqd_size = sizeof(struct v11_compute_mqd); > > > struct amdgpu_ring *ring; > > > + u32 domain = AMDGPU_GEM_DOMAIN_GTT; > > > > > > if (pipe == AMDGPU_MES_KIQ_PIPE) > > > ring = &adev->gfx.kiq[0].ring; @@ -998,9 +999,11 @@ > > > static int mes_v11_0_mqd_sw_init(struct amdgpu_device *adev, > > > if (ring->mqd_obj) > > > return 0; > > > > > > + if (amdgpu_sriov_vf(adev)) > > > + domain |= AMDGPU_GEM_DOMAIN_VRAM; > > > + > > > r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, > > > - AMDGPU_GEM_DOMAIN_VRAM | > > > - AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, > > > + domain, &ring->mqd_obj, > > > &ring->mqd_gpu_addr, &ring->mqd_ptr); > > > if (r) { > > > dev_warn(adev->dev, "failed to create ring mqd bo > > > (%d)", r); > > > -- > > > 2.37.3 > > >