Add and Hi Bart, Yi reported a crash[1] when PAGE_SIZE is 64K [1]https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@xxxxxxxxxxxxxx/T/ The root cause is unknown so far, but I notice that SRP over RXE always uses MR with page_size 4K = MAX(4K, min(device_support_page_size)) even though the device supports 64K page size. * RXE device support 4K ~ 2G page size 4024 /* 4025 * Use the smallest page size supported by the HCA, down to a 4026 * minimum of 4096 bytes. We're unlikely to build large sglists 4027 * out of smaller entries. 4028 */ 4029 mr_page_shift = max(12, ffs(attr->page_size_cap) - 1); 4030 srp_dev->mr_page_size = 1 << mr_page_shift; 4031 srp_dev->mr_page_mask = ~((u64) srp_dev->mr_page_size - 1); 4032 max_pages_per_mr = attr->max_mr_size; I doubt if we can use PAGE_SIZE MR if the device supports it. BTW, rtrs seems have the same code. @rtrs Thanks Zhijian On 20/10/2023 11:47, Zhijian Li (Fujitsu) wrote: > CC Bart > > On 13/10/2023 20:01, Daisuke Matsuda (Fujitsu) wrote: >> On Fri, Oct 13, 2023 10:18 AM Zhu Yanjun wrote: >>> From: Zhu Yanjun<yanjun.zhu@xxxxxxxxx> >>> >>> The page_size of mr is set in infiniband core originally. In the commit >>> 325a7eb85199 ("RDMA/rxe: Cleanup page variables in rxe_mr.c"), the >>> page_size is also set. Sometime this will cause conflict. >> I appreciate your prompt action, but I do not think this commit deals with >> the root cause. I agree that the problem lies in rxe driver, but what is wrong >> with assigning actual page size to ibmr.page_size? >> >> IMO, the problem comes from the device attribute of rxe driver, which is used >> in ulp/srp layer to calculate the page_size. >> ===== >> static int srp_add_one(struct ib_device *device) >> { >> struct srp_device *srp_dev; >> struct ib_device_attr *attr = &device->attrs; >> <...> >> /* >> * Use the smallest page size supported by the HCA, down to a >> * minimum of 4096 bytes. We're unlikely to build large sglists >> * out of smaller entries. >> */ >> mr_page_shift = max(12, ffs(attr->page_size_cap) - 1); > > > You light me up. > RXE provides attr.page_size_cap(RXE_PAGE_SIZE_CAP) which means it can support 4K-2G page size > > so i think more accurate logic should be: > > if (device supports PAGE_SIZE) > use PAGE_SIZE > else if (device support 4096 page_size) // fallback to 4096 > use 4096 etc... > else > ... > > > > >> srp_dev->mr_page_size = 1 << mr_page_shift; >> ===== >> On initialization of srp driver, mr_page_size is calculated here. >> Note that the device attribute is used to calculate the value of page shift >> when the device is trying to use a page size larger than 4096. Since Yi specified >> CONFIG_ARM64_64K_PAGES, the system naturally met the condition.