-----Original Message----- From: Auld, Matthew <matthew.auld@xxxxxxxxx> Sent: Tuesday, February 21, 2023 9:46 AM To: Cavitt, Jonathan <jonathan.cavitt@xxxxxxxxx> Cc: Dutt, Sudeep <sudeep.dutt@xxxxxxxxx>; Siddiqui, Ayaz A <ayaz.siddiqui@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL > > On 21/02/2023 17:14, Cavitt, Jonathan wrote: > > -----Original Message----- > > From: Auld, Matthew <matthew.auld@xxxxxxxxx> > > Sent: Tuesday, February 21, 2023 8:33 AM > > To: Cavitt, Jonathan <jonathan.cavitt@xxxxxxxxx> > > Cc: Dutt, Sudeep <sudeep.dutt@xxxxxxxxx>; Siddiqui, Ayaz A <ayaz.siddiqui@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx > > Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL > >> > >> On 21/02/2023 16:28, Cavitt, Jonathan wrote: > >>> -----Original Message----- > >>> From: Auld, Matthew <matthew.auld@xxxxxxxxx> > >>> Sent: Tuesday, February 21, 2023 8:06 AM > >>> To: Cavitt, Jonathan <jonathan.cavitt@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx > >>> Cc: Dutt, Sudeep <sudeep.dutt@xxxxxxxxx>; Siddiqui, Ayaz A <ayaz.siddiqui@xxxxxxxxx> > >>> Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL > >>>> > >>>> On 17/02/2023 19:18, Jonathan Cavitt wrote: > >>>>> MTL currently uses gen8_ppgtt_insert_huge when managing huge pages. This is because > >>>>> MTL reports as not supporting 64K pages, or more accurately, the system that reports > >>>>> whether a platform has 64K pages reports false for MTL. This is only half correct, > >>>>> as the 64K page support reporting system only cares about 64K page support for LMEM, > >>>>> which MTL doesn't have. > >>>>> > >>>>> MTL should be using xehpsdv_ppgtt_insert_huge. However, simply changing over to > >>>>> using that manager doesn't resolve the issue because MTL is expecting the virtual > >>>>> address space for the page table to be flushed after initialization, so we must also > >>>>> add a flush statement there. > >>>>> > >>>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@xxxxxxxxx> > >>>> Reviewed-by: Matthew Auld <matthew.auld@xxxxxxxxx> > >>>> > >>>> Although it looks like the hugepage mock tests are failing with this. I > >>>> assume the mock device just uses some "max" gen version or so, which now > >>>> triggers this path. Any ideas for that? > >>> > >>> With this patch applied, multiple calls to the hugepages live selftest result in a kernel panic. > >>> If the mock tests are run immediately after the live ones, that would explain this behavior. > >>> I was informed when this was initially debugged that the error was a known IOMMU issue > >>> rather than some novel regression, though it's hard to tell if that was just hopeful optimism > >>> or not at this point. > >> > >> In the test results we now get: > >> > >> 6> [183.420316] i915: Running > >> i915_gem_huge_page_mock_selftests/igt_mock_exhaust_device_supported_pages > >> <6> [183.436978] i915: Running > >> i915_gem_huge_page_mock_selftests/igt_mock_memory_region_huge_pages > >> <6> [183.445777] i915: Running > >> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_misaligned_dma > >> <6> [183.904531] i915: Running > >> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_huge_fill > >> <3> [183.912658] gtt=69632, expected=4096, size=69632, single=yes > >> <3> [183.912784] i915/i915_gem_huge_page_mock_selftests: > >> igt_mock_ppgtt_huge_fill failed with error -22 > > > > if (expected_gtt & I915_GTT_PAGE_SIZE_4K) > > expected_gtt &= ~I915_GTT_PAGE_SIZE_64K; > > > > I don't know why we're doing that to expected_gtt, but that seems to be the cause of the > > problem in this case. > > I think it's due to the older huge page model, where 64K requires the > entire page-table to all use 64K pages underneath (pde level hint), so > if we see 4K in there somewhere then we don't expect to get back 64K > GTT. But on newer HW we now have have pte level hint, so I think the > above can just be removed with this patch, since that's what the mock > device now uses. Seems right. I guess that would be... what? Is it: A. Platform specific? I.E. we need s generation check in the selftest to proceed, such as the following: if (expected_gtt & I915_GTT_PAGE_SIZE_4K && GRAPHICS_VER(i915) >= 12) B. Systems specific? I.E. we have a special check for this functionality such as: if (expected_gtt & I915_GTT_PAGE_SIZE_4K && has_pte_level_hint(i915)) C. The new norm. I.E. we can just remove this line from the test and everything will work out fine. -Jonathan Cavitt > > > -Jonathan Cavitt > > > >> > >> I didn't look any deeper than that though. Note that this a just a > >> mock/fake device. I don't think its IOMMU related. > >> > >>> -Jonathan Cavitt > >>> > >>>> > >>>>> --- > >>>>> drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++- > >>>>> 1 file changed, 2 insertions(+), 1 deletion(-) > >>>>> > >>>>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > >>>>> index 4daaa6f55668..9c571185395f 100644 > >>>>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > >>>>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > >>>>> @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm, > >>>>> } > >>>>> } while (rem >= page_size && index < max); > >>>>> > >>>>> + drm_clflush_virt_range(vaddr, PAGE_SIZE); > >>>>> vma_res->page_sizes_gtt |= page_size; > >>>>> } while (iter->sg && sg_dma_len(iter->sg)); > >>>>> } > >>>>> @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, > >>>>> struct sgt_dma iter = sgt_dma(vma_res); > >>>>> > >>>>> if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) { > >>>>> - if (HAS_64K_PAGES(vm->i915)) > >>>>> + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50)) > >>>>> xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); > >>>>> else > >>>>> gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); > >>>> > >> >