A device sending 0 length frames as fast as it can has been observed killing the host system due to the resulting memory pressure. Temporarily disable RX skb allocation and URB submission when the current error ratio is high, preventing us from trying to allocate an infinite number of skbs. Reenable as soon as we are finished processing the done queue, allowing the device to continue working after short error bursts. Signed-off-by: Bjørn Mork <bjorn@xxxxxxx> --- v2 changes: - reset counters in open to avoid starting in previous error state - remove superfluous debug logging. Errors are logged per packet when debugging is on anyway. The removal of the debug logging should address your concerns, Joe? Note that I have not added any new device reset or the like, Oliver. It does not seem to help for the device I am testing. And being a composite device, where this bug only affects *one* function, I do not think it would be appropriate even if it did help. Note that the device may be configured with multiple usbnet functions as well, and that the bug only hits one of them. All other functions work fine, as long as this patch protects the host. Noting that since the bug triggers when coming out of CDC MBIM mode, I also tried using the MBIM class specific RESET_FUNCTION request before leaving MBIM mode. But this didn't have any effect either. Just FYI. As it is, I do not see any way we can do any sensible error handling here. Let us concentrate on making usbnet survive the flood and leave further error handling to the user. Bjørn drivers/net/usb/usbnet.c | 25 +++++++++++++++++++++++++ include/linux/usb/usbnet.h | 2 ++ 2 files changed, 27 insertions(+) diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c index f34b2eb..9778377 100644 --- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -380,6 +380,12 @@ static int rx_submit (struct usbnet *dev, struct urb *urb, gfp_t flags) unsigned long lockflags; size_t size = dev->rx_urb_size; + /* prevent rx skb allocation when error ratio is high */ + if (test_bit(EVENT_RX_KILL, &dev->flags)) { + usb_free_urb(urb); + return -ENOLINK; + } + skb = __netdev_alloc_skb_ip_align(dev->net, size, flags); if (!skb) { netif_dbg(dev, rx_err, dev->net, "no rx skb\n"); @@ -539,6 +545,17 @@ block: break; } + /* stop rx if packet error rate is high */ + if (++dev->pkt_cnt > 30) { + dev->pkt_cnt = 0; + dev->pkt_err = 0; + } else { + if (state == rx_cleanup) + dev->pkt_err++; + if (dev->pkt_err > 20) + set_bit(EVENT_RX_KILL, &dev->flags); + } + state = defer_bh(dev, skb, &dev->rxq, state); if (urb) { @@ -791,6 +808,11 @@ int usbnet_open (struct net_device *net) (dev->driver_info->flags & FLAG_FRAMING_AX) ? "ASIX" : "simple"); + /* reset rx error state */ + dev->pkt_cnt = 0; + dev->pkt_err = 0; + clear_bit(EVENT_RX_KILL, &dev->flags); + // delay posting reads until we're fully open tasklet_schedule (&dev->bh); if (info->manage_power) { @@ -1254,6 +1276,9 @@ static void usbnet_bh (unsigned long param) } } + /* restart RX again after disabling due to high error rate */ + clear_bit(EVENT_RX_KILL, &dev->flags); + // waiting for all pending urbs to complete? if (dev->wait) { if ((dev->txq.qlen + dev->rxq.qlen + dev->done.qlen) == 0) { diff --git a/include/linux/usb/usbnet.h b/include/linux/usb/usbnet.h index 5de7a22..0de078d 100644 --- a/include/linux/usb/usbnet.h +++ b/include/linux/usb/usbnet.h @@ -33,6 +33,7 @@ struct usbnet { wait_queue_head_t *wait; struct mutex phy_mutex; unsigned char suspend_count; + unsigned char pkt_cnt, pkt_err; /* i/o info: pipes etc */ unsigned in, out; @@ -70,6 +71,7 @@ struct usbnet { # define EVENT_DEV_OPEN 7 # define EVENT_DEVICE_REPORT_IDLE 8 # define EVENT_NO_RUNTIME_PM 9 +# define EVENT_RX_KILL 10 }; static inline struct usb_driver *driver_of(struct usb_interface *intf) -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html