In v2 of the patch I used the _enabled() function around the blocks so the for loop is only reached if the trace is enabled. Tom On 01/08/17 11:56 AM, Xie, AlexBin wrote: > I don't know if compiler is smart enough to optimize the following for > statement out... > > > + for (i = 0; i < ttm->num_pages; i++) { > + trace_amdgpu_ttm_tt_populate( > + adev, > + gtt->ttm.dma_address[i], > + page_to_phys(ttm->pages[i])); > + } > > > -Alex Bin > > > ------------------------------------------------------------------------ > *From:* Christian König <deathsimple at vodafone.de> > *Sent:* Tuesday, August 1, 2017 11:41 AM > *To:* Xie, AlexBin; amd-gfx at lists.freedesktop.org > *Subject:* Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping > We can turn of the trace subsystem during compile time already. > > Additional to that the trace points use self modifying code to make sure > that they don't have any overhead as long as they are turned off. But I > don't think this works with the "if (trace_*_enabled()" as well. > > Anyway, not a performance critical code path so I don't really bother. > > Christian. > > Am 01.08.2017 um 17:00 schrieb axie: >> Because this is used by a debug tool, can we use a macro to >> conditionally compile this feature? >> >> >> On 2017-08-01 10:54 AM, Christian König wrote: >>> Am 01.08.2017 um 16:26 schrieb Tom St Denis: >>>> On 01/08/17 10:10 AM, Christian König wrote: >>>> >>>>> You need to cover multiple code path here: >>>>> 1. The one you currently implemented which uses ttm_dma_populate() >>>>> and pci_map_page(). >>>>> 2. The one using ttm_dma_populate(). >>>> >>>> I'll have to look into this one though it's my understanding that >>>> code path is only followed if there's no (hardware) IOMMU right? >>> >>> Wrong, that one is used when "anything" in the system used the >>> SWIOTLB path once. So the detection doesn't always work reliable. >>> >>>> This one seems rather straight forward but the only catch is I don't >>>> have access to "adev" inside that drm function. >>> >>> When you have a ttm you can always do "amdgpu_ttm_adev(ttm->bdev)" to >>> get the adev. >>> >>>> Would it be taboo to do something like >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >>>> index 376078334f54..cd97ef2144c9 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >>>> @@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt >>>> *ttm) >>>> drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages, >>>> gtt->ttm.dma_address, ttm->num_pages); >>>> ttm->state = tt_unbound; >>>> + for (i = 0; i < ttm->num_pages; i++) { >>>> + trace_amdgpu_ttm_tt_populate( >>>> + adev, >>>> + gtt->ttm.dma_address[i], >>>> + page_to_phys(ttm->pages[i])); >>>> + } >>>> return 0; >>>> } >>>> >>>> This would normally result in a for loop around a nop which >>>> shouldn't be a huge performance hit but is one none the less. >>> >>> Try using trace_trace_amdgpu_ttm_tt_populate_enabled(), those >>> functions are generated by the trace subsystem automatically when you >>> define a trace point. >>> >>> It just doesn't use those nice tricks to modify the compiled binary >>> on the fly. >>> >>> Christian. >>> >>>> >>>>> 4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for >>>>> userptrs. >>>>> >>>>> Basically all just take gtt->ttm.ttm.pages and fill >>>>> gtt->ttm.dma_address. >>>> >>>> Same comment as #3. I can tackle this with a for loop around the >>>> trace if that is ok. >>>> >>>>> I suggest to add a helper you can call to trace all >>>>> pages->dma_address mappings inside a ttm. >>>> >>>> One thing at a time :-). That would probably be a bit better since >>>> the trace log gets filled with remappings (which is why my >>>> proof-of-concept umr patch only grabs the latest mapping as it reads >>>> the trace). >>>> >>>> Is there a handy place we can grab a list of ttm_tt's bound to our >>>> device? If so then in theory I could write a debugfs interface >>>> instead. >>>> >>>> Thanks for your patience as I'm definitely less familiar with the VM >>>> side of things :-) >>>> >>>> Cheers, >>>> Tom >>>> _______________________________________________ >>>> amd-gfx mailing list >>>> amd-gfx at lists.freedesktop.org >>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > amd-gfx Info Page - freedesktop.org > <https://lists.freedesktop.org/mailman/listinfo/amd-gfx> > lists.freedesktop.org > Subscribing to amd-gfx: Subscribe to amd-gfx by filling out the > following form. Use of all freedesktop.org lists is subject to our Code > of ... > > >>> >>> >>> _______________________________________________ >>> amd-gfx mailing list >>> amd-gfx at lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > amd-gfx Info Page - freedesktop.org > <https://lists.freedesktop.org/mailman/listinfo/amd-gfx> > lists.freedesktop.org > Subscribing to amd-gfx: Subscribe to amd-gfx by filling out the > following form. Use of all freedesktop.org lists is subject to our Code > of ... > > >> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx at lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > amd-gfx Info Page - freedesktop.org > <https://lists.freedesktop.org/mailman/listinfo/amd-gfx> > lists.freedesktop.org > Subscribing to amd-gfx: Subscribe to amd-gfx by filling out the > following form. Use of all freedesktop.org lists is subject to our Code > of ... > > > > > > > _______________________________________________ > amd-gfx mailing list > amd-gfx at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx >