RE: [EXT] Re: [PATCH v11 rdma-next 6/7] RDMA/qedr: Add doorbell overflow recovery support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> From: Jason Gunthorpe <jgg@xxxxxxxx>
> Sent: Friday, September 20, 2019 4:38 PM
> 
> External Email
> 
> ----------------------------------------------------------------------
> On Fri, Sep 20, 2019 at 04:30:52PM +0300, Gal Pressman wrote:
> > On 19/09/2019 21:02, Jason Gunthorpe wrote:
> > > On Thu, Sep 05, 2019 at 01:01:16PM +0300, Michal Kalderon wrote:
> > >
> > >> @@ -347,6 +360,9 @@ void qedr_mmap_free(struct
> rdma_user_mmap_entry
> > >> *rdma_entry)  {
> > >>  	struct qedr_user_mmap_entry *entry =
> > >> get_qedr_mmap_entry(rdma_entry);
> > >>
> > >> +	if (entry->mmap_flag == QEDR_USER_MMAP_PHYS_PAGE)
> > >> +		free_page((unsigned long)phys_to_virt(entry->address));
> > >> +
> > >
> > > While it isn't wrong it do it this way, we don't need this
> > > mmap_free() stuff for normal CPU pages. Those are refcounted and
> > > qedr can simply call free_page() during the teardown of the uobject
> > > that is using the this page. This is what other drivers already do.
> >
> > This is pretty much what EFA does as well.  When we allocate pages for
> > the user (CQ for example), we DMA map them and later on mmap them to
> > the user. We expect those pages to remain until the entry is freed,
> > how can we call free_page, who is holding a refcount on those except
> > for the driver?
> 
> free_page is kind of a lie, it is more like put_page (see __free_pages). I think
> the difference is that it assumes the page came from alloc_page and skips
> some generic stuff when freeing it.
> 
> When the driver does vm_insert_page the vma holds another refcount and
> the refcount does not go to zero until that page drops out of the vma (ie at
> the same time mmap_free above is called).
> 
> Then __put_page will do the free_unref_page(), etc.
> 
> So for CPU pages it is fine to not use mmap_free so long as vm_insert_page
> is used
Jason, by adding the kref to the rdma_user_mmap_entry  we sort of disable the 
option of being sure the entry is removed from the mmap xarray when it is removed
by the driver (this will only decrease the refcnt).
If we call free_page during the uobject teardown, we can't be sure
the entry is removed from the mmap_xa, this could lead to us having an entry in the mmap_xa
that points to an invalid page. 

Perhaps we could define the entry as being ref-counted or not based on the type of address, but
the original design was to keep this information opaque to the rdma_user_mmap_entry. 
We can also assume that for CPU pages the flow that increases the refcnt won't be called
(mmap_io) but this feels like bad practice. 

How should I handle this?
Thanks,

> 
> Jason




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux