Patch solves the following problems:
1) Forgotten counter increment in sis900_rx() when it cannot get memory
   for an skb, which leads to failure of the whole interface.
   The problem is accompanied by the messages:
     eth0: Memory squeeze,deferring packet.
     eth0: NULL pointer encountered in Rx ring, skipping
2) If the counter cur_rx overflows and there is a temporary memory shortage,
   the buffer can't be recreated later, when memory IS available.
3) Limit the work in the handler to prevent endless packet processing if
   new packets arrive faster than they are handled.

Signed-off-by: Konstantin Khorenko <khorenko@xxxxx>
Signed-off-by: Vasily Averin <vvs@xxxxx>

-----------------------------------

We had a customer who complained about a problem with a network card that is
supported by the sis900 driver.
Problem description: at a random time the card suddenly stops working, and only
a reboot brings it back. The failure is accompanied by messages in
/var/log/messages:

eth0: Memory squeeze,deferring packet.
eth0: NULL pointer encountered in Rx ring, skipping
eth0: NULL pointer encountered in Rx ring, skipping
eth0: NULL pointer encountered in Rx ring, skipping
(till reboot)

We discovered that this problem is already known:
http://www.ussg.iu.edu/hypermail/linux/kernel/0407.3/0566.html
http://www.kernelnewbies.org/documents/kdoc/sis900/problems.html

Nevertheless it has not been fixed so far, so we tried to fix it.

(1)
Function sis900_rx().
During normal execution dirty_rx < cur_rx is ALWAYS true.
Let's assume we are short of memory.

    unsigned int entry = sis_priv->cur_rx % NUM_RX_DESC;
    ...
    while (rx_status & OWN) {
        ...
        if (some error check) {
            ...
        } else {
5. On the next call of the function we have rx_skbuff[cur_rx%] == NULL
   from the previous call, which means rx_skbuff[entry] == NULL
            if (sis_priv->rx_skbuff[entry] == NULL) {
                printk(KERN_INFO "%s: NULL pointer "
                       "encountered in Rx ring, skipping\n",
                       net_dev->name);
6. Print and exit the while() loop.
                break;
            }
            ...
1. Fail here.
-->         if ((skb = dev_alloc_skb(RX_BUF_SIZE)) == NULL) {
                ...
                printk(KERN_INFO "%s: Memory squeeze,"
                       "deferring packet.\n",
                       net_dev->name);
-->             sis_priv->rx_skbuff[entry] = NULL;
2. Now sis_priv->rx_skbuff[cur_rx%] == NULL
                ...
                break;
3. And we exit the while() loop without incrementing cur_rx!
            }
            ...
        } // of else
        sis_priv->cur_rx++;
        entry = sis_priv->cur_rx % NUM_RX_DESC;
        ...
    } // of while

4. We refill all buffers rx_skbuff[entry] where entry < cur_rx.
   rx_skbuff[cur_rx%] == NULL before and AFTER the loop.
    for (; sis_priv->cur_rx > sis_priv->dirty_rx; sis_priv->dirty_rx++) {
        entry = sis_priv->dirty_rx % NUM_RX_DESC;
        if (sis_priv->rx_skbuff[entry] == NULL) {
            ...
            sis_priv->rx_skbuff[entry] = skb;
            ...
        }
    }

No matter how many times the function is called, cur_rx won't be incremented,
and thus rx_skbuff[cur_rx%] will be NULL forever, which results in
never-ending messages and dropped packets.

------
(2)
The same function sis900_rx().

    for (; sis_priv->cur_rx > sis_priv->dirty_rx; sis_priv->dirty_rx++) {
        entry = sis_priv->dirty_rx % NUM_RX_DESC;
        if (sis_priv->rx_skbuff[entry] == NULL) {
            ...skb = dev_alloc_skb(RX_BUF_SIZE)...
            ...
            sis_priv->rx_skbuff[entry] = skb;
            ...
        }
    }

Assume cur_rx overflowed during the previous while() loop execution, but
dirty_rx did NOT, and we really need the buffers refilled.
The comparison sis_priv->cur_rx > sis_priv->dirty_rx will fail and the buffers
won't be refilled.

----------
(3)
The same function sis900_rx().
Assume the whole buffer is filled, there is no memory shortage, and the network
card receives packets faster than the kernel processes them in the
while (rx_status & OWN) loop of sis900_rx(): execution will never leave
the loop.
sis900_rx() is called from the interrupt handler, so it's not a good idea to
`do "too much" work here` (sentence from sources :) )

----------

We hope you'll check these changes and find them useful. :)
Kernels with the patches compile but are untested.
This patch is against the mainstream 2.4.31 kernel.

--
Best regards,
Konstantin Khorenko,
SWsoft, Inc.
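
The wraparound case described in (2) can be reproduced outside the driver. Below is a minimal sketch, assuming both counters are plain `unsigned int` values; the helper names are hypothetical and are not part of sis900.c. It compares the `cur > dirty` test quoted in the description with the `cur != dirty` test used by the patch, for a cur counter that has wrapped past UINT_MAX while dirty has not.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helpers for illustration only; not taken from sis900.c. */

/* Comparison as quoted in the description: false once cur has wrapped. */
static bool needs_refill_greater(unsigned int cur, unsigned int dirty)
{
    return cur > dirty;
}

/* Comparison used by the patch: true whenever the counters differ. */
static bool needs_refill_not_equal(unsigned int cur, unsigned int dirty)
{
    return cur != dirty;
}

/* Entries waiting for a refill; unsigned subtraction wraps modulo UINT_MAX + 1. */
static unsigned int outstanding(unsigned int cur, unsigned int dirty)
{
    return cur - dirty;
}

void wraparound_selfcheck(void)
{
    /* No wraparound: both tests agree that a refill is needed. */
    assert(needs_refill_greater(10u, 7u));
    assert(needs_refill_not_equal(10u, 7u));

    /* cur wrapped to 1 while dirty is still at UINT_MAX - 1. */
    assert(!needs_refill_greater(1u, UINT_MAX - 1u));  /* refill is skipped */
    assert(needs_refill_not_equal(1u, UINT_MAX - 1u)); /* refill still runs */
    assert(outstanding(1u, UINT_MAX - 1u) == 3u);      /* three entries pending */

    /* Nothing pending: the patched test stops the refill loop. */
    assert(!needs_refill_not_equal(7u, 7u));
}
```

Under the same assumption, a test written as an unsigned difference, `cur - dirty > 0`, behaves like `cur != dirty`, because an unsigned difference is zero only when the two counters are equal.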
--- ./drivers/net/sis900.c.sis900	2004-08-08 03:26:05.000000000 +0400
+++ ./drivers/net/sis900.c	2005-09-16 17:17:27.000000000 +0400
@@ -1613,6 +1613,14 @@ static int sis900_rx(struct net_device *
 	long ioaddr = net_dev->base_addr;
 	unsigned int entry = sis_priv->cur_rx % NUM_RX_DESC;
 	u32 rx_status = sis_priv->rx_ring[entry].cmdsts;
+	/*
+	 * If cur > dirty, then limit = NUM_RX_DESC - cur + dirty =
+	 *	NUM_RX_DESC + (dirty - cur)
+	 * If cur < dirty (cur overflowed, dirty - not), then
+	 *	limit = dirty - cur
+	 */
+	int rx_work_limit =
+		(sis_priv->dirty_rx - sis_priv->cur_rx) % NUM_RX_DESC;
 
 	if (sis900_debug > 3)
 		printk(KERN_INFO "sis900_rx, cur_rx:%4.4d, dirty_rx:%4.4d "
@@ -1622,6 +1630,8 @@ static int sis900_rx(struct net_device *
 	while (rx_status & OWN) {
 		unsigned int rx_size;
+		if (--rx_work_limit < 0)
+			break;
 
 		rx_size = (rx_status & DSIZE) - CRC_SIZE;
 
 		if (rx_status & (ABORT|OVERRUN|TOOLONG|RUNT|RXISERR|CRCERR|FAERR)) {
@@ -1688,6 +1698,7 @@ static int sis900_rx(struct net_device *
 				sis_priv->rx_ring[entry].cmdsts = 0;
 				sis_priv->rx_ring[entry].bufptr = 0;
 				sis_priv->stats.rx_dropped++;
+				sis_priv->cur_rx++;
 				break;
 			}
 			skb->dev = net_dev;
@@ -1705,7 +1716,7 @@ static int sis900_rx(struct net_device *
 	/* refill the Rx buffer, what if the rate of refilling is slower than consuming ?? */
-	for (;sis_priv->cur_rx - sis_priv->dirty_rx > 0; sis_priv->dirty_rx++) {
+	for (; sis_priv->cur_rx != sis_priv->dirty_rx; sis_priv->dirty_rx++) {
 		struct sk_buff *skb;
 
 		entry = sis_priv->dirty_rx % NUM_RX_DESC;
#
# Patch solves the following problems:
# 1) Forgotten counter increment in sis900_rx() when it cannot get memory
#    for an skb, which leads to failure of the whole interface.
#    The problem is accompanied by the messages:
#      eth0: Memory squeeze,deferring packet.
#      eth0: NULL pointer encountered in Rx ring, skipping
# 2) If the counter cur_rx overflows and there is a temporary memory shortage,
#    the buffer can't be recreated later, when memory IS available.
# 3) Limit the work in the handler to prevent endless packet processing if
#    new packets arrive faster than they are handled.
#
# Signed-off-by: Konstantin Khorenko <khorenko@xxxxx>
# Signed-off-by: Vasily Averin <vvs@xxxxx>
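
Since the patch is stated to be untested, the rx_work_limit arithmetic may be worth evaluating in isolation. Below is a minimal sketch, assuming that cur_rx and dirty_rx are `unsigned int` and that NUM_RX_DESC is 16; both are assumptions made for illustration. `work_limit_patch()` copies the expression from the patch. `work_limit_comment()` is a hypothetical alternative that follows the formula in the patch comment (NUM_RX_DESC - cur + dirty) literally; it is not claimed to be what the driver uses.

```c
#include <assert.h>
#include <limits.h>

#define NUM_RX_DESC 16 /* assumed ring size, for illustration only */

/* Expression from the patch, with the counters assumed to be unsigned int. */
static int work_limit_patch(unsigned int dirty, unsigned int cur)
{
    return (dirty - cur) % NUM_RX_DESC;
}

/* Hypothetical alternative: ring size minus the entries awaiting refill. */
static int work_limit_comment(unsigned int dirty, unsigned int cur)
{
    return dirty + NUM_RX_DESC - cur;
}

void work_limit_selfcheck(void)
{
    /* Three entries awaiting refill: both forms leave room for 13 packets. */
    assert(work_limit_patch(5u, 8u) == 13);
    assert(work_limit_comment(5u, 8u) == 13);

    /* Same gap, but cur has wrapped past UINT_MAX: still 13 in both forms. */
    assert(work_limit_patch(UINT_MAX - 1u, 1u) == 13);
    assert(work_limit_comment(UINT_MAX - 1u, 1u) == 13);

    /* cur == dirty (ring fully refilled): the two forms differ. */
    assert(work_limit_patch(5u, 5u) == 0);
    assert(work_limit_comment(5u, 5u) == NUM_RX_DESC);
}
```

Under these assumptions the expression from the patch gives a limit of 0 when cur_rx == dirty_rx. Combined with the `--rx_work_limit < 0` check at the top of the while() loop, that would end the loop before any packet is handled, so this case probably deserves a look when the patch is tested.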