Because this is used by a debug tool, can we use a macro to conditionally compile this feature? On 2017-08-01 10:54 AM, Christian König wrote: > Am 01.08.2017 um 16:26 schrieb Tom St Denis: >> On 01/08/17 10:10 AM, Christian König wrote: >> >>> You need to cover multiple code path here: >>> 1. The one you currently implemented which uses ttm_dma_populate() >>> and pci_map_page(). >>> 2. The one using ttm_dma_populate(). >> >> I'll have to look into this one though it's my understanding that >> code path is only followed if there's no (hardware) IOMMU right? > > Wrong, that one is used when "anything" in the system used the SWIOTLB > path once. So the detection doesn't always work reliable. > >> This one seems rather straight forward but the only catch is I don't >> have access to "adev" inside that drm function. > > When you have a ttm you can always do "amdgpu_ttm_adev(ttm->bdev)" to > get the adev. > >> Would it be taboo to do something like >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >> index 376078334f54..cd97ef2144c9 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >> @@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt >> *ttm) >> drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages, >> gtt->ttm.dma_address, ttm->num_pages); >> ttm->state = tt_unbound; >> + for (i = 0; i < ttm->num_pages; i++) { >> + trace_amdgpu_ttm_tt_populate( >> + adev, >> + gtt->ttm.dma_address[i], >> + page_to_phys(ttm->pages[i])); >> + } >> return 0; >> } >> >> This would normally result in a for loop around a nop which shouldn't >> be a huge performance hit but is one none the less. > > Try using trace_trace_amdgpu_ttm_tt_populate_enabled(), those > functions are generated by the trace subsystem automatically when you > define a trace point. > > It just doesn't use those nice tricks to modify the compiled binary on > the fly. > > Christian. > >> >>> 4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for >>> userptrs. >>> >>> Basically all just take gtt->ttm.ttm.pages and fill >>> gtt->ttm.dma_address. >> >> Same comment as #3. I can tackle this with a for loop around the >> trace if that is ok. >> >>> I suggest to add a helper you can call to trace all >>> pages->dma_address mappings inside a ttm. >> >> One thing at a time :-). That would probably be a bit better since >> the trace log gets filled with remappings (which is why my >> proof-of-concept umr patch only grabs the latest mapping as it reads >> the trace). >> >> Is there a handy place we can grab a list of ttm_tt's bound to our >> device? If so then in theory I could write a debugfs interface instead. >> >> Thanks for your patience as I'm definitely less familiar with the VM >> side of things :-) >> >> Cheers, >> Tom >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx at lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > > _______________________________________________ > amd-gfx mailing list > amd-gfx at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx