On 7/25/22 9:13 AM, lizhijian@xxxxxxxxxxx wrote:
On 22/07/2022 06:19, Bob Pearson wrote:
On 7/20/22 05:50, Haris Iqbal wrote:
On Wed, Jul 20, 2022 at 12:22 PM Li Zhijian <lizhijian@xxxxxxxxxxx> wrote:
The 2 commits below will be reverted:
8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()")
647bf13ce944 ("RDMA/rxe: Create duplicate mapping tables for FMRs")
The community has received a few bug reports that were eventually traced
to this commit. Several fixes were proposed in the meantime, but none of
them has been followed up.
The previous commit made the map_set of an FMR unavailable if
the MR is registered again after being invalidated. Although the mentioned
patch tries to fix a potential race in building/accessing the same table
for fast memory regions, it broke ULPs such as rnbd. Since the latter could
be worse, revert this patch.
With the previous commit, it is observed that the same MR in the rnbd server
triggers the code path below:
Looks Good. I tested the patch against rdma for-next and it solves the
problem mentioned in the commit.
One small nitpick. It should be rtrs, and not rnbd in the commit message.
Feel free to add my,
Tested-by: Md Haris Iqbal <haris.iqbal@xxxxxxxxx>
 -> rxe_mr_init_fast()
 |-> alloc map_set()         # map_set is uninitialized
 |...-> rxe_map_mr_sg()      # build the map_set
     |-> rxe_mr_set_page()
 |...-> rxe_reg_fast_mr()    # mr->state changes from FREE to VALID, which means
                             # we can access host memory (such as rxe_mr_copy)
 |...-> rxe_invalidate_mr()  # mr->state changes from VALID to FREE
 |...-> rxe_reg_fast_mr()    # mr->state changes from FREE to VALID,
                             # but map_set was not built again
 |...-> rxe_mr_copy()        # kernel crash due to accessing wild addresses
                             # looked up from the map_set
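
For readers who do not know the rxe code, the sequence above can be sketched as a toy state machine. This is only an illustrative model in plain userspace C, not kernel code: the toy_* names and struct toy_mr are hypothetical stand-ins, and the model simply assumes the map becomes unusable once the MR is invalidated, without modelling how the real code loses it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; these are NOT the real rxe types. */
enum toy_mr_state { TOY_MR_FREE, TOY_MR_VALID };

struct toy_mr {
	enum toy_mr_state state;
	bool map_built;	/* true only while the page map describes real pages */
};

/* Models rxe_map_mr_sg(): (re)builds the page map. */
static void toy_map_mr_sg(struct toy_mr *mr)
{
	mr->map_built = true;
}

/* Models rxe_reg_fast_mr(): FREE -> VALID; it does not rebuild the map. */
static void toy_reg_fast_mr(struct toy_mr *mr)
{
	assert(mr->state == TOY_MR_FREE);
	mr->state = TOY_MR_VALID;
}

/*
 * Models rxe_invalidate_mr() with the reverted commit applied (assumption):
 * VALID -> FREE, and the previously built map can no longer be used.
 */
static void toy_invalidate_mr(struct toy_mr *mr)
{
	mr->state = TOY_MR_FREE;
	mr->map_built = false;
}

/*
 * Models rxe_mr_copy(): returns 0 on success and -1 where the real code
 * would dereference addresses taken from a map that was never rebuilt.
 */
static int toy_mr_copy(const struct toy_mr *mr)
{
	if (mr->state != TOY_MR_VALID)
		return -1;
	return mr->map_built ? 0 : -1;
}
```

In this model, map -> reg -> copy succeeds, while map -> reg -> invalidate -> reg -> copy returns -1, which corresponds to the crash in the last step of the trace. Calling toy_map_mr_sg() again before the second registration makes the copy succeed again.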
Where is the use case for this? All the FMR examples I am aware of call rxe_map_mr_sg()
between each reg_fast_mr/invalidate_mr() sequence. I am not familiar with rtrs.
What is it?
It happens when we create an rnbd connection.
To be accurate, it is an rtrs connection.
modprobe rnbd_server
modprobe rnbd_client
echo "sessname=foo path=ip:<server-ip> device_path=/dev/nvme0n1" > /sys/devices/virtual/rnbd-client/ctl/map_device
I have tested blktests and the above rnbd case; they work fine.
Furthermore, your "[PATCH RFC] RDMA/rxe: Allow re-registration of FMRs" doesn't fix the above rnbd use case.
Thanks for the effort! I believe rnbd/rtrs over rxe has been broken for
a while. Can we agree that the problem needs to be fixed?
Thanks,
Guoqing