On Tue, Nov 22, 2022 at 5:31 AM Moshe Shemesh <moshe@xxxxxxxxxx> wrote:
>
>
> On 11/21/2022 11:11 AM, Jinpu Wang wrote:
> > External email: Use caution opening links or attachments
> >
> >
> > On Tue, Nov 15, 2022 at 5:41 PM Moshe Shemesh <moshe@xxxxxxxxxx> wrote:
> >>
> >> On 11/15/2022 5:08 PM, Jinpu Wang wrote:
> >>> On Tue, Nov 15, 2022 at 6:46 AM Jinpu Wang <jinpu.wang@xxxxxxxxx> wrote:
> >>>> On Tue, Nov 15, 2022 at 6:15 AM Moshe Shemesh <moshe@xxxxxxxxxx> wrote:
> >>>>> On 11/9/2022 11:51 AM, Jinpu Wang wrote:
> >>>>>> On Mon, Oct 17, 2022 at 7:54 AM Jinpu Wang <jinpu.wang@xxxxxxxxx> wrote:
> >>>>>>> On Thu, Oct 13, 2022 at 12:27 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> >>>>>>>> On Thu, Oct 13, 2022 at 10:32:55AM +0200, Jinpu Wang wrote:
> >>>>>>>>> On Thu, Oct 13, 2022 at 10:18 AM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> >>>>>>>>>> On Wed, Oct 12, 2022 at 01:55:55PM +0200, Jinpu Wang wrote:
> >>>>>>>>>>> Hi Leon, hi Saeed,
> >>>>>>>>>>>
> >>>>>>>>>>> We have seen crashes during server shutdown on both kernel 5.10 and
> >>>>>>>>>>> kernel 5.15 with GPF in mlx5 mlx5_cmd_comp_handler function.
> >>>>>>>>>>>
> >>>>>>>>>>> All of the crashes point to
> >>>>>>>>>>>
> >>>>>>>>>>> 1606         memcpy(ent->out->first.data,
> >>>>>>>>>>>                     ent->lay->out, sizeof(ent->lay->out));
> >>>>>>>>>>>
> >>>>>>>>>>> I guess, it's kind of use after free for ent buffer. I tried to reprod
> >>>>>>>>>>> by repeatedly reboot the testing servers, but no success so far.
> >>>>>>>>>> My guess is that command interface is not flushed, but Moshe and me
> >>>>>>>>>> didn't see how it can happen.
> >>>>>>>>>>
> >>>>>>>>>> 1206         INIT_DELAYED_WORK(&ent->cb_timeout_work, cb_timeout_handler);
> >>>>>>>>>> 1207         INIT_WORK(&ent->work, cmd_work_handler);
> >>>>>>>>>> 1208         if (page_queue) {
> >>>>>>>>>> 1209                 cmd_work_handler(&ent->work);
> >>>>>>>>>> 1210         } else if (!queue_work(cmd->wq, &ent->work)) {
> >>>>>>>>>>              ^^^^^^^ this is what is causing to the splat
> >>>>>>>>>> 1211                 mlx5_core_warn(dev, "failed to queue work\n");
> >>>>>>>>>> 1212                 err = -EALREADY;
> >>>>>>>>>> 1213                 goto out_free;
> >>>>>>>>>> 1214         }
> >>>>>>>>>>
> >>>>>>>>>> <...>
> >>>>>>>>>>> Is this problem known, maybe already fixed?
> >>>>>>>>>> I don't see any missing Fixes that exist in 6.0 and don't exist in 5.5.32.
> >>>>>>>> Sorry it is 5.15.32
> >>>>>>>>
> >>>>>>>>>> Is it possible to reproduce this on latest upstream code?
> >>>>>>>>> I haven't been able to reproduce it, as mentioned above, I tried to
> >>>>>>>>> reproduce by simply reboot in loop, no luck yet.
> >>>>>>>>> do you have suggestions to speedup the reproduction?
> >>>>>>>> Maybe try to shutdown during filling command interface.
> >>>>>>>> I think that any query command will do the trick.
> >>>>>>> Just an update.
> >>>>>>> I tried to run "saquery" in a loop in one session and do "modproble -r
> >>>>>>> mlx5_ib && modprobe mlx5_ib" in loop in another session during last
> >>>>>>> days , but still no luck. --c
> >>>>>>>>> Once I can reproduce, I can also try with kernel 6.0.
> >>>>>>>> It will be great.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>> Thanks!
> >>>>>> Just want to mention, we see more crash during reboot, all the crash
> >>>>>> we saw are all
> >>>>>> Intel Intel(R) Xeon(R) Gold 6338 CPU. We use the same HCA on
> >>>>>> different servers. So I suspect the bug is related to Ice Lake server.
> >>>>>>
> >>>>>> In case it matters, here is lspci attached.
> >>>>> Please try the following change on 5.15.32, let me know if it solves the
> >>>>> failure :
> >>>> Thank you Moshe, I will test it on affected servers and report back the result.
> >>> Hi Moshe,
> >>>
> >>> I've been running the reboot tests on 4 affected machines in parallel
> >>> for more than 6 hours, in total did 300+ reboot, I can no longer
> >>> reproduce the crash. without the fix, I was able to reproduce 2 times
> >>> in 20 reboots.
> >>> So I think the bug is fixed.
> >>
> >> Great !
> >>
> >>> I also did some basic functional test via RNBD/IPOIB, all look good.
> >>> Tested-by: Jack Wang <jinpu.wang@xxxxxxxxx>
> >>> Please provide a formal fix.
> >>
> >> Will do.
> > Hi Moshe,
> > A gentle ping, when will you send the fix?
> >
> > Thanks!
>
> Hi, it is part of Saeed's mlx5 fixes patchset.
>
> He sent it a couple of hours ago.

Yes, indeed. ref:
https://lore.kernel.org/netdev/20221122022559.89459-6-saeed@xxxxxxxxxx/T/#u

Thx!
>
> >
> >> Thanks!
> >>
> >>> Thx!
> >>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> >>>>> index e06a6104e91f..d45ca9c52a21 100644
> >>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> >>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> >>>>> @@ -971,6 +971,7 @@ static void cmd_work_handler(struct work_struct *work)
> >>>>>          cmd_ent_get(ent);
> >>>>>          set_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state);
> >>>>>
> >>>>> +        cmd_ent_get(ent); /* for the _real_ FW event on completion */
> >>>>>          /* Skip sending command to fw if internal error */
> >>>>>          if (mlx5_cmd_is_down(dev) || !opcode_allowed(&dev->cmd, ent->op)) {
> >>>>>                  u8 status = 0;
> >>>>> @@ -984,7 +985,6 @@ static void cmd_work_handler(struct work_struct *work)
> >>>>>                  return;
> >>>>>          }
> >>>>>
> >>>>> -        cmd_ent_get(ent); /* for the _real_ FW event on completion */
> >>>>>          /* ring doorbell after the descriptor is valid */
> >>>>>          mlx5_core_dbg(dev, "writing 0x%x to command doorbell\n", 1 << ent->idx);
> >>>>>          wmb();
> >>>>> @@ -1598,8 +1598,8 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force
> >>>>>                          cmd_ent_put(ent); /* timeout work was canceled */
> >>>>>
> >>>>>                          if (!forced || /* Real FW completion */
> >>>>> -                            pci_channel_offline(dev->pdev) || /* FW is inaccessible */
> >>>>> -                            dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
> >>>>> +                            mlx5_cmd_is_down(dev) || /* No real FW completion is expected */
> >>>>> +                            !opcode_allowed(cmd, ent->op))
> >>>>>                                  cmd_ent_put(ent);
> >>>>>
> >>>>>                          ent->ts2 = ktime_get_ns();
> >>>>>
> >>>>>> Thx!
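
For readers following the diff quoted above, here is a rough userspace sketch of why the ordering of the get matters. It is only a toy model, not the driver code: struct toy_ent, toy_get(), toy_put() and toy_skipped_command() are hypothetical names made up for illustration, and the model assumes that the forced completion does decide to drop the "FW event" reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a command entry; only the refcount is modelled. */
struct toy_ent {
	int refcnt;
};

static void toy_get(struct toy_ent *e)
{
	e->refcnt++;
}

static void toy_put(struct toy_ent *e)
{
	assert(e->refcnt > 0);	/* a put needs a reference to drop */
	e->refcnt--;
}

/*
 * Models a command that is skipped (never sent to FW) and then completed
 * by force. Assumption of this toy: the forced completion drops the
 * "FW event" reference.
 *
 * take_ref_early == false: old ordering, get only after the skip check
 * take_ref_early == true:  patched ordering, get before the skip check
 *
 * Returns the refcount left for the submitter afterwards.
 */
static int toy_skipped_command(bool take_ref_early)
{
	struct toy_ent ent = { .refcnt = 1 };	/* submitter's reference */

	if (take_ref_early)
		toy_get(&ent);	/* reference for the FW completion event */

	/* skip path: nothing is sent, completion is forced right away */
	toy_put(&ent);		/* forced completion drops the FW-event reference */

	return ent.refcnt;
}
```

In this model toy_skipped_command(false) leaves 0 references, so the submitter's own reference has been consumed while it may still touch the entry, which would match the suspected use after free. toy_skipped_command(true) leaves 1, because the put in the forced completion is balanced by the early get.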