Re: [RESEND RFC PATCH for-next] Revert "RDMA/rxe: Create duplicate mapping tables for FMRs"

Bob Pearson <rpearsonhpe@xxxxxxxxx> · Mon, 25 Jul 2022 14:15:02 -0500

On 7/24/22 23:00, lizhijian@xxxxxxxxxxx wrote:
> 
> 
> On 22/07/2022 02:18, Bob Pearson wrote:
>> On 7/20/22 05:50, Haris Iqbal wrote:
>>> On Wed, Jul 20, 2022 at 12:22 PM Li Zhijian <lizhijian@xxxxxxxxxxx> wrote:
>>>> The following two commits will be reverted:
>>>>       8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()")
>>>>       647bf13ce944 ("RDMA/rxe: Create duplicate mapping tables for FMRs")
>>>>
>>>> Several bug reports from the community were eventually traced to this
>>>> commit. Some fixes were proposed in the meantime, but none of them were
>>>> followed up.
>>>>
>>>> The previous commit made the map_set of an FMR unavailable once the MR
>>>> is registered again after being invalidated. Although the mentioned
>>>> patch tries to fix a potential race in building/accessing the same table
>>>> for fast memory regions, it broke ULPs such as rnbd. Since the latter could
>>>> be worse, revert this patch.
>>>>
>>>> With the previous commit, the same MR in the rnbd server is observed to
>>>> trigger the code path below:
>>> Looks Good. I tested the patch against rdma for-next and it solves the
>>> problem mentioned in the commit.
>>> One small nitpick. It should be rtrs, and not rnbd in the commit message.
>>>
>>> Feel free to add my,
>>>
>>> Tested-by: Md Haris Iqbal <haris.iqbal@xxxxxxxxx>
>>>
>> Li,
>>
>> It has been a while since this was added. If I recall there was a problem in rnfs
>> that this was supposed to fix. It was also supposed to allow overlap of using the
>> previous mappings and the driver creating new ones. But it seems that most fmr
>> based ulps don't require it, maybe all. Before we do this we should make sure that
>> blktests, srp, lustre, rnfs, etc all work. Have these been tested?
> 
> blktests (nvme over RXE and srp) work fine after this revert.
> lustre and rnfs have not been tested because I currently have no local lustre or rnfs environment.
> 
> I would like to know what original problem you fixed in 647bf13ce944 ("RDMA/rxe: Create duplicate mapping tables for FMRs").
> Could we use another approach for it, such as adding locks to prevent the potential *race*?
> 
> I agree with Jason's view[1] ("you need to go back to one map").
> 
> [1]: https://lore.kernel.org/all/20220527124240.GB2960187@xxxxxxxx/
> 
> Thanks
> Zhijian
> 
>>
>> Bob

Li,

I agree. You can add

Reviewed-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>

Our Lustre testing is still on older versions of the driver with one map and it works fine.
I am not able to reproduce the rnfs results from last year so I just don't know.

I still have failures in srp blktests but I doubt it is related to this issue. Tests 002 and 011
seem to hang and I have never been able to figure out why.

I suspect that mr->state can be racy. It is a state machine whose transitions can be triggered by client code
or by tasklet code on the request side or the response side. I don't have solid evidence that this has happened
but it seems to me like a good idea to guard the state machine with a spin lock. I will post a patch that
does that.
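
For illustration only, the idea of guarding a state machine with a lock can be
sketched in userspace C as below. This is not the patch and not rxe code: a
pthread mutex stands in for a kernel spin lock, and the names struct mr,
MR_STATE_*, mr_init(), mr_transition() and mr_read_state() are hypothetical.
The point is that the check of the current state and the update happen under
one lock, so two contexts cannot both win the same transition.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical states; names are assumptions, not the driver's own. */
enum mr_state { MR_STATE_INVALID, MR_STATE_FREE, MR_STATE_VALID };

struct mr {
	pthread_mutex_t lock;	/* userspace stand-in for a spin lock */
	enum mr_state state;	/* only read or written with lock held */
};

/* Initialise the lock and start in the FREE state. */
void mr_init(struct mr *mr)
{
	int err = pthread_mutex_init(&mr->lock, NULL);

	assert(err == 0);
	(void)err;
	mr->state = MR_STATE_FREE;
}

/*
 * Move from 'from' to 'to' only if the MR is currently in 'from'.
 * The compare and the store are done under the same lock, so a
 * concurrent caller sees either the old or the new state, never a
 * half-applied transition. Returns true if the transition happened.
 */
bool mr_transition(struct mr *mr, enum mr_state from, enum mr_state to)
{
	bool ok = false;

	pthread_mutex_lock(&mr->lock);
	if (mr->state == from) {
		mr->state = to;
		ok = true;
	}
	pthread_mutex_unlock(&mr->lock);

	return ok;
}

/* Read the current state under the lock. */
enum mr_state mr_read_state(struct mr *mr)
{
	enum mr_state s;

	pthread_mutex_lock(&mr->lock);
	s = mr->state;
	pthread_mutex_unlock(&mr->lock);

	return s;
}
```

A caller that loses the race gets false back from mr_transition() and can
fail the operation instead of acting on a stale state. In the kernel the lock
type and the irq/bh-safe variant would have to match the contexts involved,
which this sketch does not attempt to model.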

Bob