[AMD Official Use Only - AMD Internal Distribution Only]

> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx>
> Sent: Wednesday, November 13, 2024 12:31
>
> On 13/11/2024 17:01, Li, Yunxiang (Teddy) wrote:
> > [Public]
> >
> >> From: Koenig, Christian <Christian.Koenig@xxxxxxx>
> >> Sent: Wednesday, November 13, 2024 9:22
> >>
> >> Am 13.11.24 um 14:53 schrieb Li, Yunxiang (Teddy):
> >>> [Public]
> >>>
> >>>> From: Koenig, Christian <Christian.Koenig@xxxxxxx>
> >>>> Sent: Wednesday, November 13, 2024 6:39
> >>>>
> >>>> Am 13.11.24 um 11:25 schrieb Tvrtko Ursulin:
> >>>>> On 13/11/2024 08:49, Christian König wrote:
> >>>>>> Am 12.11.24 um 19:16 schrieb Li, Yunxiang (Teddy):
> >>>>>>> [SNIP]
> >>>>>>>>> + size = sign * amdgpu_bo_size(bo);
> >>>>>>>>> + res = bo->tbo.resource;
> >>>>>>>>> + type = res ? res->mem_type : amdgpu_bo_get_preferred_placement(bo);
> >>>>>>>> Again, it's a clear NAK from my side to do stuff like that.
> >>>>>>>>
> >>>>>>>> When there isn't any backing store the BO should *not* be
> >>>>>>>> accounted to anything.
> >>>>>>> I don't have a preference either way, but I think it should be a
> >>>>>>> separate discussion to properly define what drm-total- means.
> >>>>> Total must show the total size of all BOs which exist even if they
> >>>>> don't currently have a backing store. That's how
> >>>>> drm-usage-stats.rst defines the field and that is how all the other
> >>>>> drivers work.
> >>>> In that case we should only look at the preferred placement and not
> >>>> the backing store at all.
> >>>>
> >>>> But that makes the total identical to the requested value, doesn't it?
> >>> Yes, the issue is not which BOs need to be counted but where they
> >>> should be counted. This gets more complicated if we consider BOs that
> >>> prefer multiple placements.
> >>>
> >>> IMO it makes sense to have drm-total- work like the legacy
> >>> amd-requested-, where we look at the BO's preferred placement. For
> >>> multiple preferred placements we say that the implementation needs to
> >>> pick one of them to avoid double counting, but which one is up to the
> >>> implementation as long as it's done in a consistent manner.
> >>> Does that sound reasonable?
> >>
> >> Yeah that works for me. Just don't look at both bo->preferred_placement
> >> and bo->tbo.resource because that will certainly be inconsistent in some
> >> use cases.
> >
> > oof, from the commit message i915/xe is doing the exact opposite, BOs get
> > counted in the totals for all the possible (preferred?) regions.
>
> Which commit message? I was doing that early during i915 patch development
> but stopped in v2:
>
> commit 968853033d8aa4dbb80fbafa6f5d9b6a0ea21272
> Author: Tvrtko Ursulin <tursulin@xxxxxxxxxxx>
> Date:   Tue Nov 7 10:18:06 2023 +0000
>
>     drm/i915: Implement fdinfo memory stats printing
>
>     Use the newly added drm_print_memory_stats helper to show memory
>     utilisation of our objects in drm/driver specific fdinfo output.
>
>     To collect the stats we walk the per memory regions object lists
>     and accumulate object size into the respective drm_memory_stats
>     categories.
>
>     v2:
>      * Only account against the active region.
>
> ^^^ THIS ^^^
>
>      * Use DMA_RESV_USAGE_BOOKKEEP when testing for active. (Tejas)
>
>     v3:
>      * Update commit text. (Aravind)
>      * Update to use memory regions uabi names.
>
> In code that would be here:
>
> static void
> obj_meminfo(struct drm_i915_gem_object *obj,
>             struct drm_memory_stats stats[INTEL_REGION_UNKNOWN])
> {
>         const enum intel_region_id id = obj->mm.region ?
>                                         obj->mm.region->id :
>                                         INTEL_REGION_SMEM;
>
> So either the active region or SMEM if there is no backing store. Maybe
> that should be improved too. Grr (to myself).
>
> I don't see xe counting the total against all regions either, apart from
> maybe having a potential NULL ptr deref?
>
> static void bo_meminfo(struct xe_bo *bo,
>                        struct drm_memory_stats stats[TTM_NUM_MEM_TYPES])
> {
>         u64 sz = bo->size;
>         u32 mem_type = bo->ttm.resource->mem_type;
>
> Or is bo->ttm.resource always present in xe? Adding Matt according to git
> blame.
>
> Regards,
>
> Tvrtko

Thanks for the background. Yeah, it was mentioned in the xe commit, and the
i915 changes looked very similar, so I thought I should mention it here. Now
I've dug around the mailing list and it seems there was some confusion and
the wrong commit message got merged:

https://lists.freedesktop.org/archives/intel-xe/2023-August/010885.html
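
For the record, a rough sketch of the consistent pick I have in mind (purely
illustrative, not the actual patch; the helper name and the priority order
are made up):

static u32 fdinfo_total_domain(struct amdgpu_bo *bo)
{
	/*
	 * Pick exactly one of the preferred domains in a fixed priority
	 * order, so a BO that prefers VRAM|GTT is only ever counted once
	 * in drm-total-.
	 */
	if (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM)
		return AMDGPU_GEM_DOMAIN_VRAM;
	if (bo->preferred_domains & AMDGPU_GEM_DOMAIN_GTT)
		return AMDGPU_GEM_DOMAIN_GTT;

	return AMDGPU_GEM_DOMAIN_CPU;
}

The point is just that the choice depends only on bo->preferred_domains and
never on bo->tbo.resource, so it stays stable no matter where the BO
currently sits.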