Re: Possible race condition on xsk_socket__create/xsk_bind

On 2019-12-06 00:21, William Tu wrote:
Hi,

While testing XSK using OVS, we hit an issue when we create an xsk,
destroy it, and create it again within a short time window.
The call to xsk_socket__create returns EBUSY due to
   xsk_bind
     xdp_umem_assign_dev
       xdp_get_umem_from_qid --> return EBUSY

I found that when everything works, the sequence is
   <ovs creates xsk>
   xsk_bind
     xdp_umem_assign_dev
   <ovs destroy xsk> ...
   xsk_release
   xsk_destruct
     xdp_umem_release_deferred
       xdp_umem_release
         xdp_umem_clear_dev --> avoid the error above

But sometimes xsk_destruct has not yet been called when the
next call to xsk_bind shows up, e.g.:

   <ovs creates xsk>
   xsk_bind
     xdp_umem_assign_dev
   <ovs destroy xsk> ...
   xsk_release
   xsk_bind
     xdp_umem_assign_dev
       xdp_get_umem_from_qid (failed!)
   ....
   xsk_destruct

Is there a way to make sure the previous xsk is fully cleaned up,
so we can safely call xsk_socket__create()?


Yes, the async cleanup is annoying. I *think* it can be done synchronously, since the map doesn't linger on a sockref anymore -- 0402acd683c6 ("xsk: remove AF_XDP socket from map when the socket is released").

So, it's not a race, it's just async. :-(

I'll take a stab at fixing this!
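
Until the teardown is synchronous, one possible userspace workaround is to retry the create call while it reports EBUSY. The sketch below is only an illustration: retry_while_busy, create_fn and fake_create are hypothetical names, not libbpf API, and it assumes the create call reports failure as a negative errno (-EBUSY). A real callback would wrap xsk_socket__create(); a stand-in is used here so the sketch builds without libbpf.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical callback type: returns 0 on success or a negative errno.
 * A real callback would wrap xsk_socket__create(). */
typedef int (*create_fn)(void *ctx);

/* Call op until it stops returning -EBUSY or max_tries is exhausted,
 * sleeping delay_ms milliseconds between attempts.
 * Returns the result of the last attempt (-EINVAL if max_tries <= 0). */
static int retry_while_busy(create_fn op, void *ctx, int max_tries,
                            long delay_ms)
{
    struct timespec delay = {
        .tv_sec = delay_ms / 1000,
        .tv_nsec = (delay_ms % 1000) * 1000000L,
    };
    int ret = -EINVAL;

    assert(op != NULL);
    for (int i = 0; i < max_tries; i++) {
        ret = op(ctx);
        if (ret != -EBUSY)
            return ret;              /* success, or an unrelated error */
        if (i + 1 < max_tries)
            nanosleep(&delay, NULL); /* give the deferred release time to run */
    }
    return ret;
}

/* Stand-in for the create call (assumption, for illustration only):
 * reports -EBUSY while *ctx is positive, then succeeds. */
static int fake_create(void *ctx)
{
    int *busy_left = ctx;

    if (*busy_left > 0) {
        (*busy_left)--;
        return -EBUSY;
    }
    return 0;
}
```

Bounding the number of attempts keeps a queue that is genuinely in use by another socket from turning into an endless loop; any error other than -EBUSY is returned to the caller immediately.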


Cheers,
Björn


The error is reproduced by OVS using:
   ovs-vsctl -- set interface afxdp-p0 options:n_rxq=1 type="afxdp" options:xdp-mode=native
   ovs-vsctl -- set interface afxdp-p0 options:n_rxq=1 type="afxdp" options:xdp-mode=generic
   ovs-vsctl -- set interface afxdp-p0 options:n_rxq=1 type="afxdp" options:xdp-mode=native
This just keeps creating and destroying the xsk on the same device.

Thanks
William


