RE: [PATCH 1/1] RDMA/rxe: Fix blktests srp lead kernel panic with 64k page size

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 13, 2023 10:44 PM Rain River <rain.1986.08.12@xxxxxxxxx> wrote:
> > On Friday, October 13, 2023 9:29 PM Zhu Yanjun:
> > > 在 2023/10/13 20:01, Daisuke Matsuda (Fujitsu) 写道:
> > > > On Fri, Oct 13, 2023 10:18 AM Zhu Yanjun wrote:
> > > >> From: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
> > > >>
> > > >> The page_size of mr is set in infiniband core originally. In the commit
> > > >> 325a7eb85199 ("RDMA/rxe: Cleanup page variables in rxe_mr.c"), the
> > > >> page_size is also set. Sometime this will cause conflict.
> > > >
> > > > I appreciate your prompt action, but I do not think this commit deals with
> > > > the root cause. I agree that the problem lies in rxe driver, but what is wrong
> > > > with assigning actual page size to ibmr.page_size?
> > >
> > > Please check the source code. ibmr.page_size is assigned in
> > > infiniband/core. And then it is assigned in rxe.
> > > When the 2 are different, the problem will occur.
> >
> > In the first place, the two must always be equal.
> > Is there any situation there are two different page sizes for a MR?
> > I think I have explained the value assigned in the core layer is wrong
> > when the PAGE_SIZE is bigger than 4k, and that is why they are inconsistent.
> >
> > As I have no environment to test the kernel panic, it remains speculative,
> > but I have provided enough information for everyone to rebut my argument.
> > It is possible I am wrong. I hope someone will tell me specifically where
> > my guess is wrong, or Yi would kindly take the trouble to verify it.
> 
> I made a quick test.
> 
> With Zhu's patch, the problem fixed.
> With your patch, this problem exists. And other problems occur. I do
> not know why.
> Will continue to make tests.

Thank you for your time and work.
I will try to find out why there were two different page sizes from
different perspectives. This may take a while because I am busy
with other projects now. If anybody need the driver without the crash
issue right now, I do not object to partially undoing "RDMA/rxe: Cleanup
page variables in rxe_mr.c" as Zhu suggests, but that should be interim
and not a final solution.

Thanks,
Daisuke

> 
> >
> > Thanks,
> > Daisuke Matsuda
> >
> > > Please add logs to infiniband/core and rxe to check them.
> > >
> > > Zhu Yanjun
> > >
> > > >
> > > > IMO, the problem comes from the device attribute of rxe driver, which is used
> > > > in ulp/srp layer to calculate the page_size.
> > > > =====
> > > > static int srp_add_one(struct ib_device *device)
> > > > {
> > > >          struct srp_device *srp_dev;
> > > >          struct ib_device_attr *attr = &device->attrs;
> > > > <...>
> > > >          /*
> > > >           * Use the smallest page size supported by the HCA, down to a
> > > >           * minimum of 4096 bytes. We're unlikely to build large sglists
> > > >           * out of smaller entries.
> > > >           */
> > > >          mr_page_shift           = max(12, ffs(attr->page_size_cap) - 1);
> > > >          srp_dev->mr_page_size   = 1 << mr_page_shift;
> > > > =====
> > > > On initialization of srp driver, mr_page_size is calculated here.
> > > > Note that the device attribute is used to calculate the value of page shift
> > > > when the device is trying to use a page size larger than 4096. Since Yi specified
> > > > CONFIG_ARM64_64K_PAGES, the system naturally met the condition.
> > > >
> > > > =====
> > > > static int srp_map_finish_fr(struct srp_map_state *state,
> > > >                               struct srp_request *req,
> > > >                               struct srp_rdma_ch *ch, int sg_nents,
> > > >                               unsigned int *sg_offset_p)
> > > > {
> > > >          struct srp_target_port *target = ch->target;
> > > >          struct srp_device *dev = target->srp_host->srp_dev;
> > > > <...>
> > > >          n = ib_map_mr_sg(desc->mr, state->sg, sg_nents, sg_offset_p,
> > > >                           dev->mr_page_size);
> > > > =====
> > > > After that, mr_page_size is presumably passed to ib_core layer.
> > > >
> > > > =====
> > > > int ib_map_mr_sg(struct ib_mr *mr, struct scatterlist *sg, int sg_nents,
> > > >                   unsigned int *sg_offset, unsigned int page_size)
> > > > {
> > > >          if (unlikely(!mr->device->ops.map_mr_sg))
> > > >                  return -EOPNOTSUPP;
> > > >
> > > >          mr->page_size = page_size;
> > > >
> > > >          return mr->device->ops.map_mr_sg(mr, sg, sg_nents, sg_offset);
> > > > }
> > > > EXPORT_SYMBOL(ib_map_mr_sg);
> > > > =====
> > > > Consequently, the page size calculated in srp driver is set to ibmr.page_size.
> > > >
> > > > Coming back to rxe, the device attribute is set here:
> > > > =====
> > > > rxe.c
> > > > <...>
> > > > /* initialize rxe device parameters */
> > > > static void rxe_init_device_param(struct rxe_dev *rxe)
> > > > {
> > > >          rxe->max_inline_data                    = RXE_MAX_INLINE_DATA;
> > > >
> > > >          rxe->attr.vendor_id                     = RXE_VENDOR_ID;
> > > >          rxe->attr.max_mr_size                   = RXE_MAX_MR_SIZE;
> > > >          rxe->attr.page_size_cap                 = RXE_PAGE_SIZE_CAP;
> > > >          rxe->attr.max_qp                        = RXE_MAX_QP;
> > > > ---
> > > > rxe_param.h
> > > > <...>
> > > > /* default/initial rxe device parameter settings */
> > > > enum rxe_device_param {
> > > >          RXE_MAX_MR_SIZE                 = -1ull,
> > > >          RXE_PAGE_SIZE_CAP               = 0xfffff000,
> > > >          RXE_MAX_QP_WR                   = DEFAULT_MAX_VALUE,
> > > > =====
> > > > rxe_init_device_param() sets the attributes to rxe_dev->attr, and it is later copied
> > > > to ib_device->attrs in setup_device()@core/device.c.
> > > > See that the page size cap is hardcoded to 4096 bytes. I suspect this led to
> > > > incorrect page_size being set to ibmr.page_size, resulting in the kernel crash.
> > > >
> > > > I think rxe driver is made to be able to handle arbitrary page sizes.
> > > > Probably, we can just modify RXE_PAGE_SIZE_CAP to fix the issue.
> > > > What do you guys think?
> > > >
> > > > Thanks,
> > > > Daisuke Matsuda
> > > >
> >




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux