On 2018-04-12 04:58 PM, Andrey Grodzovsky wrote: > On 04/12/2018 10:33 AM, Michel Dänzer wrote: >> On 2018-04-12 03:33 PM, Alex Deucher wrote: >>> On Thu, Apr 12, 2018 at 7:17 AM, Andrey Grodzovsky >>> <Andrey.Grodzovsky at amd.com> wrote: >>>> On 04/12/2018 12:32 AM, Alex Deucher wrote: >>>>> On Thu, Apr 12, 2018 at 12:08 AM, Andrey Grodzovsky >>>>> <andrey.grodzovsky at amd.com> wrote: >>>>>> Reserved VRAM is used to avoid overriding pre OS FB. >>>>>> Once our display stack takes over we don't need the reserved >>>>>> VRAM anymore. >>>>>> >>>>>> v2: >>>>>> Remove comment, we know actually why we need to reserve the stolen >>>>>> VRAM. >>>>>> Fix return type for amdgpu_ttm_late_init. >>>>>> v3: >>>>>> Return 0 in amdgpu_bo_late_init, rebase on changes to previous patch >>>>>> v4: >>>>>> Don't release stolen memory for GMC9 ASICs untill GART corruption >>>>>> on S3 resume is resolved. >>>>>> >>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com> >>>>>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com> >> [...] > > Not sure what it means ? It means I trimmed some quoted text. Would it be clearer if I put quotation markers before it? >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >>>>>> index 252a6c69..099e3ce5 100644 >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >>>>>> @@ -659,6 +659,16 @@ static int gmc_v9_0_late_init(void *handle) >>>>>>          unsigned i; >>>>>>          int r; >>>>>> >>>>>> +      /* >>>>>> +       * TODO: >>>>>> +       * Currently there is a bug where some memory client outside >>>>>> +       * of the driver writes to first 8M of VRAM on S3 resume, >>>>>> +       * this overrides GART which by default gets placed in >>>>>> first 8M >>>>>> and >>>>>> +       * causes VM_FAULTS once GTT is accessed. >>>>>> +       * Keep the stolen memory reservation until this solved. >>>>>> +       */ >>>>>> +      /* amdgpu_bo_late_init(adev); / >>>>>> + >>>>> We still need to free this somewhere. I'd suggest calling it in >>>>> gmc_v9_0_sw_fini() and add a comment there about moving it when we fix >>>>> the issue. >>>>> >>>>>>          for(i = 0; i < adev->num_rings; ++i) { >>>>>>                  struct amdgpu_ring *ring = adev->rings[i]; >>>>>>                  unsigned vmhub = ring->funcs->vmhub; >>>>>> @@ -884,10 +894,6 @@ static int gmc_v9_0_sw_init(void *handle) >>>>>>           */ >>>>>>          adev->gmc.mc_mask = 0xffffffffffffULL; /* 48 bit MC */ >>>>>> >>>>>> -      /* >>>>>> -       * It needs to reserve 8M stolen memory for vega10 >>>>>> -       * TODO: Figure out how to avoid that... >>>>>> -       */ >>>>>>          adev->gmc.stolen_size = gmc_v9_0_get_vbios_fb_size(adev); >>>>> We may also just want to return 8MB or 9MB temporarily in >>>>> gmc_v9_0_get_vbios_fb_size until we sort out the root cause of the S3 >>>>> issue otherwise we're potentially wasting a lot more memory. >>>> But what if we have 4k display ? In this case returning 9M probably >>>> will not >>>> hide the corruption we were originally dealing with. I remember in >>>> that >>>> case pre OS FB size would be 32M. >>> I guess it's a trade off, possible garbage monentary during bios to >>> driver transition vs. wasting an additional 24 MB of CPU accessible >>> vram for the life of the driver. >> Can we free the reserved memory after initialization, then reserve it >> again on resume? > > The issue here was that someone overrides the first 8M of VRAM and > corrupts the GART table, which causes VM_FAULTS. Until we find who is > writing into this area of VRAM and when exactly I think we cannot allow > any data to be placed there since it's might get corrupted (even if we > avoid placing the GART table there). I think it shouldn't be too hard in general to make sure the GART table, and any other BOs which stay in VRAM across suspend/resume, don't fall in the affected area. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer