On 15/08/17 08:11 AM, Christian König wrote:
> Am 15.08.2017 um 13:33 schrieb Tom St Denis:
>> On 15/08/17 07:22 AM, Christian König wrote:
>>> Am 14.08.2017 um 13:37 schrieb Tom St Denis:
>>>> This patch adds trace points to the amdgpu bo_kmap/bo_kunmap
>>>> functions, which capture internal allocations.
>>>>
>>>> (v2): Simply add a sleep to the init. Users can use this script to
>>>> load with map debugging enabled:
>>>>
>>>> #!/bin/bash
>>>> modprobe amdgpu map_debug=1 &
>>>> sleep 1
>>>> echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_unpopulate/enable
>>>> echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_populate/enable
>>>
>>> I've just realized that there is a far simpler method than that, or
>>> than enabling trace points using the module command line.
>>>
>>> Assuming your GPU is PCI device 01:00.0 connected through bridge
>>> 00:02.1 (use lspci -t to figure out the connection):
>>>
>>> # Temporarily remove the PCIe device from the bus
>>> echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/remove
>>> # Load amdgpu; this allows you to enable the trace points you want
>>> # and also set probes, etc.
>>> modprobe amdgpu
>>> # Rescan the bus; the device subsystem will automatically probe
>>> # amdgpu with this device
>>> echo 1 > /sys/bus/pci/devices/0000\:00\:02.1/rescan
>>
>> That would definitely be a bunch trickier to script up. Nobody will
>> run this by hand, so we'd need to remove all devices that have the
>> AMD vendor ID but are GPUs only (e.g. not other controllers), which
>> means we'd need to parse the lspci output for VGA-controller-related
>> text.
>>
>> Definitely doable, and probably "cleaner" from a race point of view.
>>
>>> Apart from that I would really like to see those trace points in TTM
>>> instead. I also don't mind adding a dev or pdev pointer to the
>>> ttm_bo_device structure for this.
>>
>> Not against this in principle. There's a bit of confusion...
>> our TTM amdgpu functions are passed
>>
>> struct ttm_tt *ttm
>>
>> which we cast to
>>
>> struct amdgpu_ttm_tt {
>>     struct ttm_dma_tt ttm;
>>     struct amdgpu_device *adev;

> BTW: Isn't this adev pointer here superfluous? I mean the ttm_tt has a
> bdev pointer to use, doesn't it?

Honestly, without cscoping it I wouldn't know :-)

>>     u64 offset;
>>     ... <snip>
>>
>> Then we use the "->ttm" of that to get dma_address[], but pages[] from
>> the original ttm_tt structure ... from what I recall (when I looked
>> into this last week), ttm_dma_tt and ttm_tt aren't the same structure,
>> right? (And looking at it right now, ttm_dma_tt begins with a
>> ttm_tt....)
>>
>> So wherever I put the trace in TTM I need access to dma_address[],
>> which doesn't seem so clear cut. Can I just cast a ttm_tt pointer to
>> ttm_dma_tt?
>
> Ah, ok, now I understand your problem. Well, printing the DMA
> addresses in this situation is probably not a good idea to start with.
>
> When a buffer object is mapped to kernel space there is no guarantee
> that it already has its DMA mapping. For amdgpu we probably never run
> into this situation, but for other drivers that might not be the case.

At the points where I'm putting the traces, a "map" has already happened
per page, as in ttm_populate and our kmap functions.

> Do we really need that info?

To reverse the IOMMU translation I need the bus/DMA address, the
physical address, and the PCI device information (because the dma <=>
phys translation is only valid per PCI device).

Tom