On Fri, Aug 04, 2023 at 05:05:22PM +0200, Petr Pavlu wrote: > Function mlx4_en_queue_bond_work() is used in mlx4_en to start a bond > reconfiguration. It gathers data about a new port map setting, takes > a reference on the netdev that triggered the change and queues a work > object on mlx4_en_priv.mdev.workqueue to perform the operation. The > scheduled work is mlx4_en_bond_work() which calls > mlx4_bond()/mlx4_unbond() and consequently mlx4_do_bond(). > > At the same time, function mlx4_change_port_types() in mlx4_core might > be invoked to change the port type configuration. As part of its logic, > it re-registers the whole device by calling mlx4_unregister_device(), > followed by mlx4_register_device(). > > The two operations can result in concurrent access to the data about > currently active interfaces on the device. > > Functions mlx4_register_device() and mlx4_unregister_device() lock the > intf_mutex to gain exclusive access to this data. The current > implementation of mlx4_do_bond() doesn't do that which could result in > an unexpected behavior. An updated version of mlx4_do_bond() for use > with an auxiliary bus goes and locks the intf_mutex when accessing a new > auxiliary device array. > > However, doing so can then result in the following deadlock: > * A two-port mlx4 device is configured as an Ethernet bond. > * One of the ports is changed from eth to ib, for instance, by writing > into a mlx4_port<x> sysfs attribute file. > * mlx4_change_port_types() is called to update port types. It invokes > mlx4_unregister_device() to unregister the device which locks the > intf_mutex and starts removing all associated interfaces. > * Function mlx4_en_remove() gets invoked and starts destroying its first > netdev. This triggers mlx4_en_netdev_event() which recognizes that the > configured bond is broken. It runs mlx4_en_queue_bond_work() which > takes a reference on the netdev. Removing the netdev now cannot > proceed until the work is completed. > * Work function mlx4_en_bond_work() gets scheduled. It calls > mlx4_unbond() -> mlx4_do_bond(). The latter function tries to lock the > intf_mutex but that is not possible because it is held already by > mlx4_unregister_device(). > > This particular case could be possibly solved by unregistering the > mlx4_en_netdev_event() notifier in mlx4_en_remove() earlier, but it > seems better to decouple mlx4_en more and break this reference order. > > Avoid then this scenario by recognizing that the bond reconfiguration > operates only on a mlx4_dev. The logic to queue and execute the bond > work can be moved into the mlx4_core driver. Only a reference on the > respective mlx4_dev object is needed to be taken during the work's > lifetime. This removes a call from mlx4_en that can directly result in > needing to lock the intf_mutex, it remains a privilege of the core > driver. > > Signed-off-by: Petr Pavlu <petr.pavlu@xxxxxxxx> > Tested-by: Leon Romanovsky <leon@xxxxxxxxxx> > --- > .../net/ethernet/mellanox/mlx4/en_netdev.c | 62 +----------------- > drivers/net/ethernet/mellanox/mlx4/main.c | 65 +++++++++++++++++-- > drivers/net/ethernet/mellanox/mlx4/mlx4.h | 5 ++ > include/linux/mlx4/device.h | 13 ++++ > include/linux/mlx4/driver.h | 19 ------ > 5 files changed, 77 insertions(+), 87 deletions(-) > Thanks, Reviewed-by: Leon Romanovsky <leonro@xxxxxxxxxx>