Hello Yi,

As I replied in the other thread, I believe the issue comes from a device
attribute of the rxe driver that is hardcoded for 4k page systems.
Cf. https://lore.kernel.org/all/OS3PR01MB98651C7454C46841B8A78F11E5D2A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Unfortunately, I have no aarch64 machine available to verify that.
Sorry to trouble you, but could you apply the change below and check
whether it resolves the issue?

=====
diff --git a/drivers/infiniband/sw/rxe/rxe_param.h b/drivers/infiniband/sw/rxe/rxe_param.h
index d2f57ead78ad..dc0f28c264b9 100644
--- a/drivers/infiniband/sw/rxe/rxe_param.h
+++ b/drivers/infiniband/sw/rxe/rxe_param.h
@@ -38,7 +38,7 @@ static inline enum ib_mtu eth_mtu_int_to_enum(int mtu)
 /* default/initial rxe device parameter settings */
 enum rxe_device_param {
 	RXE_MAX_MR_SIZE			= -1ull,
-	RXE_PAGE_SIZE_CAP		= 0xfffff000,
+	RXE_PAGE_SIZE_CAP		= 0xffffffff - (PAGE_SIZE - 1),
 	RXE_MAX_QP_WR			= DEFAULT_MAX_VALUE,
 	RXE_DEVICE_CAP_FLAGS		= IB_DEVICE_BAD_PKEY_CNTR
=====

Regards,
Daisuke Matsuda

On Wed, Oct 11, 2023 9:33 AM Yi Zhang wrote:
> On Tue, Oct 10, 2023 at 9:37 PM Zhu Yanjun <yanjun.zhu@xxxxxxxxx> wrote:
> >
> >
> > On 2023/10/10 19:35, Jason Gunthorpe wrote:
> > > On Tue, Oct 10, 2023 at 06:41:17PM +0800, Zhu Yanjun wrote:
> > >> On 2023/10/9 12:35, Yi Zhang wrote:
> > >>> Hello
> > >>>
> > >>> blktests srp lead kernel panic[2] on aarch64 when the kernel enabled
> > >>> CONFIG_ARM64_64K_PAGES, bisect shows it was introduced from commit[1],
> > >>> pls help check it and let me know if you need any info/testing for it, thanks.
> > >>>
> > >>> [1]
> > >>> commit 325a7eb85199ec9c5b5a7af812f43ea16b735569
> > >>> Author: Bob Pearson <rpearsonhpe@xxxxxxxxx>
> > >>> Date: Thu Jan 19 17:59:36 2023 -0600
> > >>>
> > >>> RDMA/rxe: Cleanup page variables in rxe_mr.c
> > >>>
> > >>> Cleanup usage of mr->page_shift and mr->page_mask and introduce
> > >>> an extractor for mr->ibmr.page_size.
> > >>> Normal usage in the kernel
> > >>> has page_mask masking out offset in page rather than masking out
> > >>> the page number. The rxe driver had reversed that which was confusing.
> > >>> Implicitly there can be a per mr page_size which was not uniformly
> > >>> supported.
> > >>>
> > >>> Link: https://lore.kernel.org/r/20230119235936.19728-6-rpearsonhpe@xxxxxxxxx
> > >>> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
> > >>> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > >>>
> > >> Hi, Yi
> > >>
> > >> I delved into the commit. And the commit can not be reverted cleanly. So I
> > >> made the following diff to try to revert this commit. After this commit is
> > >> applied, rping can work well.
>
> Hi Yanjun
>
> With the change, the blktests srp works now.
> > > We can't keep reverting things for what are probably small bugs. Fix
> > > the issues please!
> >
> >
> > This is not an official commit. Because the reporter mentioned that the
> > commit causes this problem,
> >
> > we just confirmed that. If we confirmed that this commit is the root
> > cause, we will analyze this commit,
> >
> > then fix it.
> >
> > Zhu Yanjun
> >
> >
> > >
> > > Jason
> >
>
> --
> Best Regards,
> Yi Zhang
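
For reference, the arithmetic behind the proposed RXE_PAGE_SIZE_CAP value
can be checked in userspace with a minimal sketch like the one below. It
assumes the attribute is treated as a bitmap of supported page sizes whose
lowest set bit is the smallest supported size. The helper names
page_size_cap() and min_page_size() are hypothetical and exist only for
this illustration; they are not kernel functions, and the page size is
passed as a parameter instead of using the kernel's PAGE_SIZE macro.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: evaluates the expression from the proposed diff,
 * 0xffffffff - (PAGE_SIZE - 1), for a page size given as an argument.
 * The page size must be a non-zero power of two.
 */
static uint32_t page_size_cap(uint32_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1u)) == 0);
	return 0xffffffffu - (page_size - 1u);
}

/*
 * Hypothetical helper: smallest page size advertised by a capability
 * bitmap, i.e. its lowest set bit. Assumes cap is non-zero.
 */
static uint32_t min_page_size(uint32_t cap)
{
	return cap & (~cap + 1u);
}
```

With a 4k page size the expression gives 0xfffff000, the same as the old
hardcoded constant. With a 64k page size it gives 0xffff0000, whereas the
old constant would still advertise sizes from 4k to 32k, all of which are
smaller than PAGE_SIZE on such a kernel.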