> -----Original Message----- > From: Leon Romanovsky <leon@xxxxxxxxxx> > Sent: Thursday, November 2, 2023 1:32 PM > To: Maxim Samoylov <max7255@xxxxxxxx>; Bernard Metzler > <BMT@xxxxxxxxxxxxxx>; Dennis Dalessandro > <dennis.dalessandro@xxxxxxxxxxxxxxxxxxxx> > Cc: Guoqing Jiang <guoqing.jiang@xxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; > Jason Gunthorpe <jgg@xxxxxxxx>; Christian Benvenuti <benve@xxxxxxxxx>; > Vadim Fedorenko <vadim.fedorenko@xxxxxxxxx> > Subject: [EXTERNAL] Re: [PATCH v2] IB: rework memlock limit handling code > > On Tue, Oct 31, 2023 at 01:30:27PM +0000, Maxim Samoylov wrote: > > On 23/10/2023 07:52, Leon Romanovsky wrote: > > > On Mon, Oct 23, 2023 at 09:40:16AM +0800, Guoqing Jiang wrote: > > >> > > >> > > >> On 10/15/23 17:19, Leon Romanovsky wrote: > > >>> On Thu, Oct 12, 2023 at 01:29:21AM -0700, Maxim Samoylov wrote: > > >>>> This patch provides the uniform handling for RLIM_INFINITY value > > >>>> across the infiniband/rdma subsystem. > > >>>> > > >>>> Currently in some cases the infinity constant is treated > > >>>> as an actual limit value, which could be misleading. > > >>>> > > >>>> Let's also provide the single helper to check against process > > >>>> MEMLOCK limit while registering user memory region mappings. > > >>>> > > >>>> Signed-off-by: Maxim Samoylov<max7255@xxxxxxxx> > > >>>> --- > > >>>> > > >>>> v1 -> v2: rewritten commit message, rebased on recent upstream > > >>>> > > >>>> drivers/infiniband/core/umem.c | 7 ++----- > > >>>> drivers/infiniband/hw/qib/qib_user_pages.c | 7 +++---- > > >>>> drivers/infiniband/hw/usnic/usnic_uiom.c | 6 ++---- > > >>>> drivers/infiniband/sw/siw/siw_mem.c | 6 +++--- > > >>>> drivers/infiniband/sw/siw/siw_verbs.c | 23 ++++++++++------- > ----- > > >>>> include/rdma/ib_umem.h | 11 +++++++++++ > > >>>> 6 files changed, 31 insertions(+), 29 deletions(-) > > >>> <...> > > >>> > > >>>> @@ -1321,8 +1322,8 @@ struct ib_mr *siw_reg_user_mr(struct ib_pd > *pd, u64 start, u64 len, > > >>>> struct siw_umem *umem = NULL; > > >>>> struct siw_ureq_reg_mr ureq; > > >>>> struct siw_device *sdev = to_siw_dev(pd->device); > > >>>> - > > >>>> - unsigned long mem_limit = rlimit(RLIMIT_MEMLOCK); > > >>>> + unsigned long num_pages = > > >>>> + (PAGE_ALIGN(len + (start & ~PAGE_MASK))) >> PAGE_SHIFT; > > >>>> int rv; > > >>>> siw_dbg_pd(pd, "start: 0x%pK, va: 0x%pK, len: %llu\n", > > >>>> @@ -1338,19 +1339,15 @@ struct ib_mr *siw_reg_user_mr(struct ib_pd > *pd, u64 start, u64 len, > > >>>> rv = -EINVAL; > > >>>> goto err_out; > > >>>> } > > >>>> - if (mem_limit != RLIM_INFINITY) { > > >>>> - unsigned long num_pages = > > >>>> - (PAGE_ALIGN(len + (start & ~PAGE_MASK))) >> > PAGE_SHIFT; > > >>>> - mem_limit >>= PAGE_SHIFT; > > >>>> - if (num_pages > mem_limit - current->mm->locked_vm) { > > >>>> - siw_dbg_pd(pd, "pages req %lu, max %lu, lock %lu\n", > > >>>> - num_pages, mem_limit, > > >>>> - current->mm->locked_vm); > > >>>> - rv = -ENOMEM; > > >>>> - goto err_out; > > >>>> - } > > >>>> + if (!ib_umem_check_rlimit_memlock(num_pages + current->mm- > >locked_vm)) { > > >>>> + siw_dbg_pd(pd, "pages req %lu, max %lu, lock %lu\n", > > >>>> + num_pages, rlimit(RLIMIT_MEMLOCK), > > >>>> + current->mm->locked_vm); > > >>>> + rv = -ENOMEM; > > >>>> + goto err_out; > > >>>> } > > >>> Sorry for late response, but why does this hunk exist in first place? > > >>> > > > > Trailing newline, will definitely drop it. > > > > >>>> + > > >>>> umem = siw_umem_get(start, len, ib_access_writable(rights)); > > >>> This should be ib_umem_get(). > > >> > > >> IMO, it deserves a separate patch, and replace siw_umem_get with > ib_umem_get > > >> is not straightforward given siw_mem has two types of memory (pbl and > umem). > > > > > > The thing is that once you convince yourself that SIW should use > ib_umem_get(), > > > the same question will arise for other parts of this patch where > > > ib_umem_check_rlimit_memlock() is used. > > > > > > And if we eliminate them all, there won't be a need for this new API > call at all. > > > > > > Thanks > > > > > > > Hi! > > > > So, as for 31.10.2023 I still see siw_umem_get() call used in > > linux-rdma repo in "for-next" branch. > > I hoped to hear some feedback from Bernard and Dennis. Sorry for the late reply. Not using ib_umem_get(), siw intentionally implements a simpler routine to get its user pages pinned and organized as a two dimensional list. It not only saves memory for the structure holding the pages, but also makes it very easy to resume sending or receiving data at any address without traversing ib_umem linearly. What we could do is using ib_umem_get() to get the page list and use that list to organize pages more efficiently for a software rdma driver. I think this is what rxe implements today. It comes with the penalty of extra data structures being allocated a software rdma driver does not care about. > > > > > AFAIU this helper call is used only in a single place and could > > potentially be replaced with ib_umem_get() as Leon suggests. > > > > But should we perform it right inside this memlock helper patch? > > > > I can submit later another patch with siw_umem_get() replaced > > if necessary. > > > > > > >> > > >> Thanks, > > >> Guoqing > >