On 24/05/2022 18:56, Haris Iqbal wrote: > On Tue, May 24, 2022 at 6:00 AM lizhijian@xxxxxxxxxxx > <lizhijian@xxxxxxxxxxx> wrote: >> Hi Jason & Bob >> CC Guoqing >> >> @Guoqing, It may correlate with your previous bug report: https://lore.kernel.org/all/20220210073655.42281-1-guoqing.jiang@xxxxxxxxx/T/ >> >> >> It's observed that a same MR in rnbd server will trigger below code >> path: >> -> rxe_mr_init_fast() >> |-> alloc map_set() # map_set is uninitialized >> |...-> rxe_map_mr_sg() # build the map_set >> |-> rxe_mr_set_page() >> |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE that means >> # we can access host memory(such rxe_mr_copy) >> |...-> rxe_invalidate_mr() # mr->state change to FREE from VALID >> |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE, >> # but map_set was not built again >> |...-> rxe_mr_copy() # kernel crash due to access wild addresses >> # that lookup from the map_set >> >> I draft a patch like below for it, but i wonder if it's rxe's responsibility to do such checking. >> Any comments are very welcome. >> > Thanks for posting the description and the patch. > > We have been facing the exact same issue (only with rxe), > and we also > realized that to get around this we will have to call ib_map_mr_sg() > before every IB_WR_REG_MR wr; even if we are reusing the MR and simply > invalidating and re-validating them. Yes, this workaround should work but expensive. It seems Bob has started a new thread to discuss the FMRs in https://www.spinics.net/lists/linux-rdma/msg110836.html Thanks Zhijian > > In reference to this, we have 2 questions. > > 1) This change was made in the following commit. > > commit 647bf13ce944f20f7402f281578423a952274e4a > Author: Bob Pearson <rpearsonhpe@xxxxxxxxx> > Date: Tue Sep 14 11:42:06 2021 -0500 > > RDMA/rxe: Create duplicate mapping tables for FMRs > > For fast memory regions create duplicate mapping tables so ib_map_mr_sg() > can build a new mapping table which is then swapped into place > synchronously with the execution of an IB_WR_REG_MR work request. > > Currently the rxe driver uses the same table for receiving RDMA operations > and for building new tables in preparation for reusing the MR. This > exposes users to potentially incorrect results. > > Link: https://lore.kernel.org/r/20210914164206.19768-5-rpearsonhpe@xxxxxxxxx > Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx> > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx> > > We tried to understand what potential incorrect result that commit > message talks about, but were not able to. If someone can through > light into this scenario where single mapping table can result into > issues, it would be great. > > 2) > We wanted to confirm that, with the above patch, it clearly means that > the use-case where we reuse the MR, by simply invalidating and > re-validating (IB_WR_REG_MR wr) is a correct one. > And there is no requirement saying that ib_map_mr_sg() needs to be > done everytime regardless. > > (We were planning to send this in the coming days, but wanted other > discussions to get over. Since the patch got posted and discussion > started, we thought better to put this out.) > > Regards > >> >> On 23/05/2022 22:02, Li, Zhijian wrote: >>> on 2022/5/20 22:45, Jason Gunthorpe wrote: >>>> On Wed, May 18, 2022 at 12:37:25PM +0800, Li Zhijian wrote: >>>>> Below call chains will alloc map_set without fully initializing map_set. >>>>> rxe_mr_init_fast() >>>>> -> rxe_mr_alloc() >>>>> -> rxe_mr_alloc_map_set() >>>>> >>>>> Uninitialized values inside struct rxe_map_set are possible to cause >>>>> kernel panic. >>>> If the value is uninitialized then why is 0 an OK value? >>>> >>>> Would be happier to know the exact value that is not initialized >>> Well, good question. After re-think of this issue, it seems this patch wasn't the root cause though it made the crash disappear in some extent. >>> >>> I'm still working on the root cause :) >>> >>> Thanks >>> >>> Zhijian >>> >>> >>>> Jason >>>