Hi!

> [ Upstream commit be6cef69ba570ebb327eba1ef6438f7af49aaf86 ]
>
> On embedded environments with hard memory limits it is a normal although
> rare case when skb can't be allocated on rx part under high traffic.
>
> In such OOM cases napi_complete_done() was not called.
> So the napi object became in an invalid state like it is "scheduled".
> Kernel do not re-schedules the poll of that napi object.
>
> Consequently, kernel can not remove that object the system hangs on
> `ifconfig down` waiting for a poll.
>
> We are fixing this by gracefully closing napi poll routine with correct
> invocation of napi_complete_done.
>
> This was reproduced with artificially failing the allocation of skb to
> simulate an "out of memory" error case and check that traffic does
> not get stuck.
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_vec.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_vec.c
> @@ -89,6 +89,7 @@ static int aq_vec_poll(struct napi_struct *napi, int budget)
>  		}
>  	}
>
> +err_exit:
>  	if (!was_tx_cleaned)
>  		work_done = budget;
>

This results in some... really "interesting" code that could use some
refactoring.

First, "goto err_exit" is now the same as break.

Second, if (!self) now sets a variable that is never used. "if (!self)
return 0;" would be more readable and would allow for less confusing
indentation.

Best regards,
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
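
For illustration only, the two suggestions above could look roughly like
the sketch below. It is not the real aq_vec_poll(): struct fake_ring,
struct fake_vec, fake_rx_clean() and the napi_completed flag are
hypothetical stand-ins invented here so that the control flow compiles
and runs outside the kernel. The only points it tries to show are the
early "return 0" for a NULL self and a plain break in place of
"goto err_exit" when the label sits directly after the loop.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the driver's real structures. */
struct fake_ring {
	int pending;		/* packets waiting on this ring */
	bool fail_alloc;	/* simulate an skb allocation failure */
};

struct fake_vec {
	struct fake_ring *rings;
	size_t n_rings;
	bool napi_completed;	/* stands in for napi_complete_done() being called */
};

/* Hypothetical rx clean step: returns 0 on success, -1 on simulated OOM. */
static int fake_rx_clean(struct fake_ring *ring, int *work_done, int budget)
{
	if (ring->fail_alloc)
		return -1;

	while (ring->pending > 0 && *work_done < budget) {
		ring->pending--;
		(*work_done)++;
	}
	return 0;
}

static int fake_vec_poll(struct fake_vec *self, int budget)
{
	int work_done = 0;
	size_t i;

	/* Early return: no error variable to set, no extra indentation. */
	if (!self)
		return 0;

	for (i = 0; i < self->n_rings; i++) {
		/* The exit label would sit right after this loop, so break is enough. */
		if (fake_rx_clean(&self->rings[i], &work_done, budget) < 0)
			break;
	}

	/* Reached on both the normal and the error path, so the poll is
	 * always completed when the budget was not exhausted. */
	if (work_done < budget)
		self->napi_completed = true;

	return work_done;
}
```

With this shape the completion step is reached on the OOM path as well,
which is the behaviour the upstream commit is after, without needing a
label.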