[AMD Official Use Only - AMD Internal Distribution Only] > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx> > Sent: Thursday, November 7, 2024 5:41 > On 25/10/2024 18:41, Yunxiang Li wrote: > > Add a helper to check if the memory stats is zero, this will be used > > to check for memory accounting errors. > > > > Signed-off-by: Yunxiang Li <Yunxiang.Li@xxxxxxx> > > --- > > drivers/gpu/drm/drm_file.c | 9 +++++++++ > > include/drm/drm_file.h | 1 + > > 2 files changed, 10 insertions(+) > > > > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c > > index 714e42b051080..75ed701d80f74 100644 > > --- a/drivers/gpu/drm/drm_file.c > > +++ b/drivers/gpu/drm/drm_file.c > > @@ -859,6 +859,15 @@ static void print_size(struct drm_printer *p, const char > *stat, > > drm_printf(p, "drm-%s-%s:\t%llu%s\n", stat, region, sz, units[u]); > > } > > > > +int drm_memory_stats_is_zero(const struct drm_memory_stats *stats) { > > + return (stats->shared == 0 && > > + stats->private == 0 && > > + stats->resident == 0 && > > + stats->purgeable == 0 && > > + stats->active == 0); > > +} > > Could use mem_is_zero() for some value of source/binary compactness. Yeah, the patch set started out with that when it's just a function in amdgpu, but Christ didn't like it. > > +EXPORT_SYMBOL(drm_memory_stats_is_zero); > > + > > I am not a huge fan of adding this as an interface as the only caller appears to be a > sanity check in amdgpu_vm_fini(): > > if (!amdgpu_vm_stats_is_zero(vm)) > dev_err(adev->dev, "VM memory stats is non-zero when fini\n"); > > But I guess there is some value in sanity checking since amdgpu does not have a > notion of debug only code (compiled at production and exercised via a test suite). > > I do suggest to demote the dev_err to notice log level would suffice and be more > accurate. I think it's very important to have a check like this when we have a known invariant, especially in this case where there's stat tracking code spread out everywhere and we have very little chance of catching a bug right when it happened. And since whenever this check fails we know for sure there is a bug, I don't see the harm of keeping it as an error. Now that I think about it, I probably want to have the process & task name in here to aid in reproduction. Teddy