On 1/18/25 02:08, Jakub Kicinski wrote:
On Fri, 17 Jan 2025 23:20:58 +0000 Pavel Begunkov wrote:
True, so not twice, but the race is there. It's not correct to call
ops of a device which has already been unregistered.
Ok, from what you're saying that holds regardless of whether the
netdev still has refs lingering. In that case it was better a version ago
where io_uring was just taking the rtnl lock, which protects
against concurrent unregistration while io_uring is checking
netdev.
Yes, v9 didn't have this race, it just didn't release the netdev ref
correctly. Plus we plan to lift the rtnl_lock requirement on this API
in 6.14, so the locking details best live under net/.
The change I suggested earlier should be fine.
- If the uninstall path wins, it will clear and put the netdev under the
spin lock, and the close path will do nothing.
- If the close path grabs the netdev pointer, the uninstall path will
do nothing in io_uring, just clear the pointers on the net/ side. Then
the close path will grab the lock in net_mp_open_rxq(), see the netdev
as unregistered, return early, and put the ref.
Did I miss something?
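
For illustration only, the two orderings above can be walked through in a
small userspace model. This is a sketch under assumptions, not the actual
patch: struct fake_netdev, struct fake_ifq, ifq_take_netdev() and the other
helpers are hypothetical names, the C11 atomic_flag merely stands in for the
spin lock, and the run is single-threaded, so each interleaving is replayed
by hand rather than raced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real kernel structures. */
struct fake_netdev {
    int refs;        /* references held on the device */
    bool registered; /* cleared once unregistration has run */
    int ops_calls;   /* device ops invoked by the close path */
};

struct fake_ifq {
    atomic_flag lock;           /* stands in for the spin lock */
    struct fake_netdev *netdev; /* pointer plus one ref owned by the ifq */
};

/* Take the netdev pointer under the lock; only the first caller gets it. */
static struct fake_netdev *ifq_take_netdev(struct fake_ifq *q)
{
    struct fake_netdev *dev;

    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ; /* spin; never contended in this single-threaded model */
    dev = q->netdev;
    q->netdev = NULL;
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
    return dev;
}

static void netdev_put(struct fake_netdev *dev)
{
    assert(dev->refs > 0); /* a double put would trip here */
    dev->refs--;
}

/* Uninstall path: the device is being unregistered. */
static void uninstall_path(struct fake_ifq *q, struct fake_netdev *unreg)
{
    struct fake_netdev *dev;

    unreg->registered = false; /* net/ side state is cleared either way */
    dev = ifq_take_netdev(q);
    if (dev)
        netdev_put(dev);       /* uninstall won: clear and put */
}

/* Second half of the close path, run with whatever pointer it grabbed. */
static void close_path_finish(struct fake_netdev *dev)
{
    if (!dev)
        return;                /* uninstall won: nothing to do */
    if (dev->registered)
        dev->ops_calls++;      /* ops are only called on a live device */
    netdev_put(dev);           /* put the ref in both cases */
}

/* Ordering 1: the uninstall path wins. */
struct fake_netdev run_uninstall_first(void)
{
    struct fake_netdev dev = { .refs = 1, .registered = true, .ops_calls = 0 };
    struct fake_ifq q = { .lock = ATOMIC_FLAG_INIT, .netdev = &dev };

    uninstall_path(&q, &dev);
    close_path_finish(ifq_take_netdev(&q));
    return dev;
}

/* Ordering 2: the close path grabs the pointer, then uninstall runs. */
struct fake_netdev run_close_grabs_first(void)
{
    struct fake_netdev dev = { .refs = 1, .registered = true, .ops_calls = 0 };
    struct fake_ifq q = { .lock = ATOMIC_FLAG_INIT, .netdev = &dev };
    struct fake_netdev *grabbed = ifq_take_netdev(&q);

    uninstall_path(&q, &dev);
    close_path_finish(grabbed);
    return dev;
}
```

In both orderings the model drops the single reference exactly once and
never calls device ops after unregistration. It says nothing about the
locking inside net/ itself.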
That should work, but it's also a house of cards compared to the
alternative; that netdev trickery with a bunch of sync around it is a
direct product of that. It absolutely will fail at some point.
I'll put it in, I don't care anymore.
--
Pavel Begunkov