Re: [PATCH] net/skbuff: silence warnings under memory pressure

Eric Dumazet <eric.dumazet@xxxxxxxxx> · Fri, 30 Aug 2019 18:15:22 +0200

On 8/30/19 5:25 PM, Qian Cai wrote:
> On Fri, 2019-08-30 at 17:11 +0200, Eric Dumazet wrote:
>>
>> On 8/30/19 4:57 PM, Qian Cai wrote:
>>> When running heavy memory pressure workloads, the system is throwing
>>> endless warnings below due to the allocation could fail from
>>> __build_skb(), and the volume of this call could be huge which may
>>> generate a lot of serial console output and cosumes all CPUs as
>>> warn_alloc() could be expensive by calling dump_stack() and then
>>> show_mem().
>>>
>>> Fix it by silencing the warning in this call site. Also, it seems
>>> unnecessary to even print a warning at all if the allocation failed in
>>> __build_skb(), as it may just retransmit the packet and retry.
>>>
>>
>> Same patches are showing up there and there from time to time.
>>
>> Why is this particular spot interesting, against all others not adding
>> __GFP_NOWARN ?
>>
>> Are we going to have hundred of patches adding __GFP_NOWARN at various points,
>> or should we get something generic to not flood the syslog in case of memory
>> pressure ?
>>
> 
> From my testing which uses LTP oom* tests. There are only 3 places need to be
> patched. The other two are in IOMMU code for both Intel and AMD. The place is
> particular interesting because it could cause the system with floating serial
> console output for days without making progress in OOM. I suppose it ends up in
> a looping condition that warn_alloc() would end up generating more calls into
> __build_skb() via ksoftirqd.
> 

Yes, but what about other tests done by other people ?

You do not really answer my last question, which was really the point I tried
to make.

If there is a risk of flooding the syslog, we should fix this generically
in mm layer, not adding hundred of __GFP_NOWARN all over the places.

Maybe just make __GFP_NOWARN the default, I dunno.