On 6/13/24 10:54, Larysa Zaremba wrote:
On Wed, Jun 12, 2024 at 02:09:35PM -0700, Jakub Kicinski wrote:
On Wed, 12 Jun 2024 08:56:38 +0200 Larysa Zaremba wrote:
On Tue, Jun 11, 2024 at 07:38:37PM -0700, Jakub Kicinski wrote:
On Mon, 10 Jun 2024 17:37:12 +0200 Larysa Zaremba wrote:
Fix the problems that are triggered by tx_timeout and ice_xdp() calls,
including both pool and program operations.
Is there really no way for ice to fix the locking? :(
The busy loops and trylocks() are not great, and seem like duct tape.
The locking mechanisms I use here do not look pretty, but if I am not missing
anything, the synchronization they provide must be robust.
Robust as in they may be correct here, but you lose lockdep and all
other infra normal mutex would give you.
I know, but __netif_queue_set_napi() requires rtnl_lock() inside the potential
critical section and creates a deadlock this way. However, after reading
patches that introduce this function, I think it is called too early in the
configuration. Seems like it should be called somewhere right after
netif_set_real_num_rx/_tx_queues(), much later in the configuration where we
already hold the rtnl_lock(). In such way, ice_vsi_rebuild() could be protected
with an internal mutex. WDYT?
A prettier way of protecting the same critical sections would be replacing
ICE_CFG_BUSY around ice_vsi_rebuild() with rtnl_lock(), this would eliminate
locking code from .ndo_bpf() altogether, ice_rebuild_pending() logic will have
to stay.
At some point I have decided to avoid using rtnl_lock(), if I do not have to. I
think this is a goal worth pursuing?
Is the reset for failure recovery, rather than reconfiguration?
If so netif_device_detach() is generally the best way of avoiding
getting called (I think I mentioned it to someone @intal recently).
for the reference, it was to Dawid here:
https://lore.kernel.org/netdev/20240610194756.5be5be90@xxxxxxxxxx/
AFAIK, netif_device_detach() does not affect .ndo_bpf() calls. We were trying
such approach with idpf and it does work for ethtool, but not for XDP.