On Thu, Oct 26, 2023 at 08:59:34PM +0800, Zhu Yanjun wrote:
> On 2023/10/26 19:42, Jason Gunthorpe wrote:
> > On Thu, Oct 26, 2023 at 09:05:52AM +0000, Zhijian Li (Fujitsu) wrote:
> > > The root cause is that
> > >
> > > rxe:rxe_set_page() gets wrong when mr.page_size != PAGE_SIZE where it only stores the *page to xarray.
> > > So the offset will get lost.
> > >
> > > For example,
> > > store process:
> > > page_size = 0x1000;
> > > PAGE_SIZE = 0x10000;
> > > va0 = 0xffff000020651000;
> > > page_offset = 0 = va & (page_size - 1);
> > > page = va_to_page(va);
> > > xa_store(&mr->page_list, mr->nbuf, page, GFP_KERNEL);
> > >
> > > load_process:
> > > page = xa_load(&mr->page_list, index);
> > > page_va = kmap_local_page(page) --> it must be a PAGE_SIZE align value, assume it as 0xffff000020650000
> > > va1 = page_va + page_offset = 0xffff000020650000 + 0 = 0xffff000020650000;
> > >
> > > Obviously, *va0 != va1*, page_offset get lost.
> > >
> > >
> > > How to fix:
> > > - revert 325a7eb85199 ("RDMA/rxe: Cleanup page variables in rxe_mr.c")
> > > - don't allow ulp registering mr.page_size != PAGE_SIZE ?
> >
> > Lets do the second one please. Most devices only support PAGE_SIZE anyhow.
>
> Normally page_size is PAGE_SIZE or the size of the whole compound page (in
> the latest kernel version, it is the size of folio). When compound page or
> folio is taken into account, the page_size is not equal to
> PAGE_SIZE.

Folios are always multiples of PAGE_SIZE. rxe splits everything into
PAGE_SIZE units in the xarray.

> If the ULP uses the compound page or folio, the similar problem will occur
> again.

No, it won't. We never store folios in the xarray.

Jason
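
For anyone following along, the arithmetic in Zhijian's example can be reproduced in plain userspace C. This is only a rough sketch, not the rxe code: the helper names (stored_page_base, mr_page_offset, reconstructed_va, mr_page_size_supported) are made up for illustration, the masking stands in for va_to_page() followed by kmap_local_page(), and the last helper is just one guess at what the "second fix" check could look like.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: true if x is a nonzero power of two. */
static int is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Stand-in for kmap_local_page(va_to_page(va)): storing only the
 * struct page means only the PAGE_SIZE-aligned base survives.
 */
static uint64_t stored_page_base(uint64_t va, uint64_t sys_page_size)
{
	assert(is_pow2(sys_page_size));
	return va & ~(sys_page_size - 1);
}

/* Offset as the example computes it, against mr.page_size. */
static uint64_t mr_page_offset(uint64_t va, uint64_t mr_page_size)
{
	assert(is_pow2(mr_page_size));
	return va & (mr_page_size - 1);
}

/* Address rebuilt on the load side: page base plus the MR-relative offset. */
static uint64_t reconstructed_va(uint64_t va, uint64_t mr_page_size,
				 uint64_t sys_page_size)
{
	return stored_page_base(va, sys_page_size) +
	       mr_page_offset(va, mr_page_size);
}

/*
 * Assumed shape of the second proposed fix: refuse any MR whose
 * page size differs from the system page size.
 */
static int mr_page_size_supported(uint64_t mr_page_size,
				  uint64_t sys_page_size)
{
	return mr_page_size == sys_page_size;
}
```

With the values from the example (va0 = 0xffff000020651000, page_size = 0x1000, PAGE_SIZE = 0x10000), reconstructed_va() gives 0xffff000020650000, which is 0x1000 below va0. When both sizes are equal, the base and the offset are taken against the same mask and the original address comes back, which is why restricting mr.page_size to PAGE_SIZE avoids the problem.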