Re: [PATCH] ipoib: clean ib tx ring periodically

On Wed, 2017-03-01 at 09:28 +0200, Erez Shitrit wrote:
> On Thu, Feb 16, 2017 at 5:35 PM, Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
> > The skbs transmitted via ipoib_send() are freed only if there are
> > 16 or more outstanding work requests or if the send queue is full.
> > 
> > If there is very little networking activity, the transmitted skbs
> > can be held by the device driver for an unlimited amount of time,
> > starving other subsystems.
> > 
> > E.g. assuming IPv6 is enabled, with the following sequence:
> > 
> > systemctl start firewalld
> > modprobe ib_ipoib
> > ip addr add dev ib0 fc00::1/64
> > systemctl stop firewalld
> > 
> > a CPU will hang: rmmod conntrack will keep a core busy
> > spinning while it waits for nf_conntrack_untracked to go to 0,
> > since some ICMPv6 ND packets are generated and transmitted when
> > the IPv6 address is attached to the device, and such packets get
> > a notrack ct entry.
> > 
> > This change addresses the issue by introducing a periodic timer that
> > performs "garbage collection" on the send ring at low frequency (once
> > every second).
> > 
> > This new timer runs independently of the currently used poll_timer,
> > so that no additional delay is introduced when cleaning the ring after
> > errors or a ring-full event.
> 
> Hi,
> 
> Adding a new timer is not the right solution; it is a workaround for
> the TX part of the ipoib driver.
> The real solution, IMHO, is to use the NAPI mechanism for TX in a
> similar way to how it is done for RX (and in many network drivers).
> 
> We (Mellanox) are planning to send such a solution in the next few days.

Thank you for jumping in on this.

I think that the TX NAPI polling implementation for the ipoib driver is
not so straightforward because, afaics, the IB completion callback is
intentionally avoided for TX, except in exceptional scenarios,
possibly for performance reasons.

Anyway, if you can fix this in a cleaner way, I'll be more than happy. 
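
For reference, the behaviour described in the patch can be modelled in
user space. What follows is only a minimal sketch under stated
assumptions, not the ipoib code: every name in it (tx_ring, tx_send,
tx_reclaim, tx_gc_tick, SENDQ_SIZE, RECLAIM_THRESH) is hypothetical, the
ring size is arbitrary, and every posted buffer is assumed to be already
completed by the hardware. It shows that buffers stay held while fewer
than 16 sends are outstanding, and that a periodic tick releases them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the behaviour described in the patch;
 * none of these names come from the ipoib driver. */
#define SENDQ_SIZE      64  /* assumed ring size */
#define RECLAIM_THRESH  16  /* "16 or more outstanding work requests" */

struct tx_ring {
	unsigned int head;          /* next slot to fill */
	unsigned int tail;          /* oldest slot not yet reclaimed */
	bool in_use[SENDQ_SIZE];    /* stands in for the held skbs */
};

/* Number of buffers posted but not yet freed. */
static unsigned int tx_outstanding(const struct tx_ring *r)
{
	return r->head - r->tail;
}

/* Free every held buffer. The model assumes all of them have been
 * completed by the hardware, so nothing is left behind. */
static unsigned int tx_reclaim(struct tx_ring *r)
{
	unsigned int freed = 0;

	while (r->tail != r->head) {
		unsigned int slot = r->tail % SENDQ_SIZE;

		assert(r->in_use[slot]);
		r->in_use[slot] = false;
		r->tail++;
		freed++;
	}
	return freed;
}

/* Transmit path: buffers are reclaimed only once the threshold is hit.
 * Returns false if the ring is full and nothing could be queued. */
static bool tx_send(struct tx_ring *r)
{
	if (tx_outstanding(r) == SENDQ_SIZE)
		return false;

	r->in_use[r->head % SENDQ_SIZE] = true;
	r->head++;

	if (tx_outstanding(r) >= RECLAIM_THRESH)
		tx_reclaim(r);
	return true;
}

/* Periodic "garbage collection" tick. In the patch a timer fires once
 * every second; here the caller invokes it by hand. */
static unsigned int tx_gc_tick(struct tx_ring *r)
{
	return tx_reclaim(r);
}
```

With a zero-initialised ring, three calls to tx_send() leave three
buffers held indefinitely; a single tx_gc_tick() frees them. Sixteen
back-to-back sends trigger the threshold path and free everything
without waiting for the tick.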

Thank you,

Paolo