On Thu, Jun 06, 2019 at 12:44:36PM -0700, Ralph Campbell wrote:
>
> On 6/6/19 7:50 AM, Jason Gunthorpe wrote:
> > On Mon, May 06, 2019 at 04:29:41PM -0700, rcampbell@xxxxxxxxxx wrote:
> > > From: Ralph Campbell <rcampbell@xxxxxxxxxx>
> > >
> > > The helper function hmm_vma_fault() calls hmm_range_register() but is
> > > missing a call to hmm_range_unregister() in one of the error paths.
> > > This leads to a reference count leak and ultimately a memory leak on
> > > struct hmm.
> > >
> > > Always call hmm_range_unregister() if hmm_range_register() succeeded.
> > >
> > > Signed-off-by: Ralph Campbell <rcampbell@xxxxxxxxxx>
> > > Signed-off-by: Jérôme Glisse <jglisse@xxxxxxxxxx>
> > > Cc: John Hubbard <jhubbard@xxxxxxxxxx>
> > > Cc: Ira Weiny <ira.weiny@xxxxxxxxx>
> > > Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> > > Cc: Arnd Bergmann <arnd@xxxxxxxx>
> > > Cc: Balbir Singh <bsingharora@xxxxxxxxx>
> > > Cc: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> > > Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> > > Cc: Souptick Joarder <jrdr.linux@xxxxxxxxx>
> > > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > >  include/linux/hmm.h | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> > > index 35a429621e1e..fa0671d67269 100644
> > > +++ b/include/linux/hmm.h
> > > @@ -559,6 +559,7 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block)
> > >  		return (int)ret;
> > >
> > >  	if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
> > > +		hmm_range_unregister(range);
> > >  		/*
> > >  		 * The mmap_sem was taken by driver we release it here and
> > >  		 * returns -EAGAIN which correspond to mmap_sem have been
> > > @@ -570,13 +571,13 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block)
> > >
> > >  	ret = hmm_range_fault(range, block);
> > >  	if (ret <= 0) {
> > > +		hmm_range_unregister(range);
> >
> > While this seems to be a clear improvement, it seems there is still a
> > bug in nouveau_svm.c around here as I see it calls hmm_vma_fault() but
> > never calls hmm_range_unregister() for its on stack range - and
> > hmm_vma_fault() still returns with the range registered.
> >
> > As hmm_vma_fault() is only used by nouveau and is marked as
> > deprecated, I think we need to fix nouveau, either by dropping
> > hmm_range_fault(), or by adding the missing unregister to nouveau in
> > this patch.
>
> I will send a patch for nouveau to use hmm_range_register() and
> hmm_range_fault() and do some testing with OpenCL.

wow, thanks, I'd also really like to send such a thing through
hmm.git - do you know who the nouveau maintainers are so we can
collaborate on planning this patch?

> I can also send a separate patch to then remove hmm_vma_fault()
> but I guess that should be after AMD's changes.

Let us wait to hear back from AMD how they can consume hmm.git - I'd
very much like to get everything done in one kernel cycle!

Regards,
Jason