On Mon, Oct 23, 2023 at 11:52 AM Zhijian Li (Fujitsu) <lizhijian@xxxxxxxxxxx> wrote:
>
> On 20/10/2023 22:01, Jason Gunthorpe wrote:
> > On Fri, Oct 20, 2023 at 03:47:05AM +0000, Zhijian Li (Fujitsu) wrote:
> >> CC Bart
> >>
> >> On 13/10/2023 20:01, Daisuke Matsuda (Fujitsu) wrote:
> >>> On Fri, Oct 13, 2023 10:18 AM Zhu Yanjun wrote:
> >>>> From: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
> >>>>
> >>>> The page_size of the mr is originally set in the infiniband core. In the
> >>>> commit 325a7eb85199 ("RDMA/rxe: Cleanup page variables in rxe_mr.c"), the
> >>>> page_size is also set. Sometimes this will cause a conflict.
> >>> I appreciate your prompt action, but I do not think this commit deals with
> >>> the root cause. I agree that the problem lies in the rxe driver, but what is
> >>> wrong with assigning the actual page size to ibmr.page_size?
> >>>
> >>> IMO, the problem comes from the device attribute of the rxe driver, which is
> >>> used in the ulp/srp layer to calculate the page_size.
> >>> =====
> >>> static int srp_add_one(struct ib_device *device)
> >>> {
> >>>         struct srp_device *srp_dev;
> >>>         struct ib_device_attr *attr = &device->attrs;
> >>> <...>
> >>>         /*
> >>>          * Use the smallest page size supported by the HCA, down to a
> >>>          * minimum of 4096 bytes. We're unlikely to build large sglists
> >>>          * out of smaller entries.
> >>>          */
> >>>         mr_page_shift = max(12, ffs(attr->page_size_cap) - 1);
> >>
> >> You've enlightened me.
> >> RXE provides attr.page_size_cap (RXE_PAGE_SIZE_CAP), which means it can
> >> support 4K-2G page sizes.
> >
> > That doesn't seem right even in concept.
> >
> > I think the multi-size support in the new xarray code does not work
> > right; just looking at it makes me think it does not work right. It
> > looks like it can do less than PAGE_SIZE, but more than PAGE_SIZE will
> > explode because kmap_local_page() does only 4K.
> >
> > If RXE_PAGE_SIZE_CAP == PAGE_SIZE, will everything work?
>
> Yeah, this should work (though I only verified it by hardcoding mr_page_shift
> to PAGE_SHIFT).

Hi Zhijian,

Did you try blktests nvme/rdma with use_rxe in your environment? It still
fails on my side:

# use_rxe=1 nvme_trtype=rdma ./check nvme/003
nvme/003 (test if we're sending keep-alives to a discovery controller) [failed]
    runtime  12.179s  ...  11.941s
    --- tests/nvme/003.out      2023-10-22 10:54:43.041749537 -0400
    +++ /root/blktests/results/nodev/nvme/003.out.bad   2023-10-23 05:52:27.882759168 -0400
    @@ -1,3 +1,3 @@
     Running nvme/003
    -NQN:nqn.2014-08.org.nvmexpress.discovery disconnected 1 controller(s)
    +NQN:nqn.2014-08.org.nvmexpress.discovery disconnected 0 controller(s)
     Test complete

[ 7033.431910] rdma_rxe: loaded
[ 7033.456341] run blktests nvme/003 at 2023-10-23 05:52:15
[ 7033.502306] (null): rxe_set_mtu: Set mtu to 1024
[ 7033.510969] infiniband enP2p1s0v0_rxe: set active
[ 7033.510980] infiniband enP2p1s0v0_rxe: added enP2p1s0v0
[ 7033.549301] loop0: detected capacity change from 0 to 2097152
[ 7033.556966] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 7033.566711] nvmet_rdma: enabling port 0 (10.19.240.81:4420)
[ 7033.588605] nvmet: connect attempt for invalid controller ID 0x808
[ 7033.594909] nvme nvme0: Connect Invalid Data Parameter, cntlid: 65535
[ 7033.601504] nvme nvme0: failed to connect queue: 0 ret=16770
[ 7046.317861] rdma_rxe: unloaded

>
> >>> import ctypes
> >>> libc = ctypes.cdll.LoadLibrary('libc.so.6')
> >>> hex(65536)
> '0x10000'
> >>> libc.ffs(0x10000) - 1
> 16
> >>> 1 << 16
> 65536
>
> so
>     mr_page_shift = max(12, ffs(attr->page_size_cap) - 1) = max(12, 16) = 16;
>
> So I think Daisuke's patch should work as well.
>
> https://lore.kernel.org/linux-rdma/OS3PR01MB98652B2EC2E85DAEC6DDE484E5D2A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#md133060414f0ba6a3dbaf7b4ad2374c8a347cfd1
>
> > Jason

--
Best Regards,
  Yi Zhang
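As a side note on the arithmetic discussed above, below is a minimal userspace
C sketch (not kernel code) of how srp_add_one()'s mr_page_shift falls out of a
device's page_size_cap bitmask. The helper name mr_page_shift_from_cap() and
the cap values exercised in main() are illustrative assumptions for this
sketch, not values taken from the thread.
=====
/* Userspace sketch of max(12, ffs(attr->page_size_cap) - 1) from srp_add_one(). */
#include <stdio.h>
#include <strings.h>	/* ffs() */

#define MIN_MR_PAGE_SHIFT 12	/* 4096-byte floor used by srp_add_one() */

/* Hypothetical helper: lowest page shift advertised by page_size_cap,
 * clamped to the 4K minimum. */
static int mr_page_shift_from_cap(int page_size_cap)
{
	int shift = ffs(page_size_cap) - 1;

	return shift > MIN_MR_PAGE_SHIFT ? shift : MIN_MR_PAGE_SHIFT;
}

int main(void)
{
	/* Illustrative caps: lowest supported page size of 4K vs. 64K. */
	printf("cap 0x1000  -> mr_page_shift %d\n", mr_page_shift_from_cap(0x1000));  /* 12 */
	printf("cap 0x10000 -> mr_page_shift %d\n", mr_page_shift_from_cap(0x10000)); /* 16 */
	return 0;
}
=====
A cap whose lowest set bit is 4K leaves the shift at the 4K floor, while one
whose lowest set bit is 64K yields 16, matching the ffs() arithmetic in the
quoted Python session; this is consistent with the suggestion above that
advertising a page_size_cap no larger than PAGE_SIZE should behave correctly.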