Re: [Bug #14141] order 2 page allocation failures in iwlagn

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Friday 02 October 2009, Frans Pop wrote:
> On Thursday 01 October 2009, Rafael J. Wysocki wrote:
> > Bug-Entry     : http://bugzilla.kernel.org/show_bug.cgi?id=14141
> > Subject       : order 2 page allocation failures in iwlagn
> > Submitter     : Frans Pop <elendil@xxxxxxxxx>
> > Date          : 2009-09-06 7:40 (26 days old)
> > References    : http://marc.info/?l=linux-kernel&m=125222287419691&w=4
> > Handled-By    : Pekka Enberg <penberg@xxxxxxxxxxxxxx>
>
> I'm not sure about this.
>
> The error messages from failed allocations should now be a lot less as a
> result of this commit:
> commit f82a924cc88a5541df1d4b9d38a0968cd077a051
> Author: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> Date:   Thu Sep 17 10:43:56 2009 -0700
>     iwlwifi: reduce noise when skb allocation fails
>
> That commit is in mainline, and I'm not sure if it is important enough
> for a stable update (AFAICT it's not listed for 2.6.31.2).
>
> That commit is mostly cosmetic, but possibly the real regression is not
> in iwlagn but in the way memory is freed/defragmented. That aspect was
> also reported by Bartlomiej (#14016) and was extensively discussed
> (without a clear conclusion) here: http://lkml.org/lkml/2009/8/26/140.

I may be getting somewhere with this. I just got the allocation failures
included below on .32-rc3. Note that these are not the "fixable" failures
that got suppressed with the commit referenced above, but the "this could
affect networking" failures that are still reported.

What I was doing when I got them is also interesting:
- a kernel build
- a gitk for the kernel tree (with full history this uses ~40% of memory)
- by mistake I then started a _second_ gitk

The second gitk (which shows as 'wish8.5' in top) caused a massive swap
out which brought the system to a standstill for a while (with huge
latencies as well, including a completely stuck mouse cursor, which
happens rarely).
The system has 2GB RAM + 2GB swap, so IIUC there is no danger of getting
into an OOM as the first gitk can be swapped out completely.

I'll dig into this a bit more as it looks like this should be reproducible,
probably even without the kernel build. Next step is to see how .30 behaves
in the same situation.

Even if it is reproducible with .30, I wonder if the kernel shouldn't be
more robust in this situation. Currently it seems to allow one single
process to claim so much memory before swapping out that "normal" operation
of other processes is affected. I can understand that such a situation may
be hard to avoid on a very busy system where multiple processes start
claiming (a lot of) memory at roughly the same time, but I'd say it should
be avoidable if a single process is the culprit.

BTW, the system recovered completely, although that took some time (the
first gitk remained visible in top long after I closed its window; I think
because the system was busy swapping it back in before terminating it).

Cheers,
FJP

kcryptd: page allocation failure. order:2, mode:0x4020
Pid: 1483, comm: kcryptd Not tainted 2.6.32-rc3 #22
Call Trace:
 <IRQ>  [<ffffffff8107c3d5>] __alloc_pages_nodemask+0x5a2/0x5ec
 [<ffffffff81264892>] ? _spin_unlock+0x9/0xb
 [<ffffffff811e73cd>] ? __alloc_skb+0x3c/0x15b
 [<ffffffffa03202cb>] ? iwl_rx_allocate+0x8f/0x305 [iwlcore]
 [<ffffffff8107c431>] __get_free_pages+0x12/0x41
 [<ffffffff8109cb1a>] __kmalloc_track_caller+0x3b/0xed
 [<ffffffff811e73f7>] __alloc_skb+0x66/0x15b
 [<ffffffffa03202cb>] iwl_rx_allocate+0x8f/0x305 [iwlcore]
 [<ffffffffa0320557>] iwl_rx_replenish_now+0x16/0x23 [iwlcore]
 [<ffffffffa035c0c8>] iwl_rx_handle+0x3a8/0x3c1 [iwlagn]
 [<ffffffff81051add>] ? sched_clock_local+0x1c/0x80
 [<ffffffffa035c60d>] iwl_irq_tasklet_legacy+0x52c/0x7a4 [iwlagn]
 [<ffffffffa0317aaf>] ? __iwl_read32+0xa5/0xb4 [iwlcore]
 [<ffffffff8103efb8>] tasklet_action+0x71/0xbc
 [<ffffffff8103f837>] __do_softirq+0x96/0x11b
 [<ffffffff8100cabc>] call_softirq+0x1c/0x28
 [<ffffffff8100e5ef>] do_softirq+0x33/0x6b
 [<ffffffff8103f5c5>] irq_exit+0x36/0x75
 [<ffffffff8100dcf1>] do_IRQ+0xa3/0xba
 [<ffffffff8100c353>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff811199dd>] ? scatterwalk_start+0x11/0x19
 [<ffffffff8111bbca>] ? blkcipher_walk_first+0x173/0x196
 [<ffffffff8111b67b>] ? blkcipher_walk_done+0xe6/0x1b8
 [<ffffffff8111bc35>] ? blkcipher_walk_virt+0x1a/0x1d
 [<ffffffffa02001cf>] ? crypto_cbc_encrypt+0x43/0x18e [cbc]
 [<ffffffff81127efd>] ? blk_recount_segments+0x1b/0x2c
 [<ffffffffa021371e>] ? aes_encrypt+0x0/0xf [aes_x86_64]
 [<ffffffff8111af64>] ? async_encrypt+0x38/0x3a
 [<ffffffffa01f7b54>] ? crypt_convert+0x1f9/0x28b [dm_crypt]
 [<ffffffffa01f8009>] ? kcryptd_crypt+0x423/0x449 [dm_crypt]
 [<ffffffffa01f7be6>] ? kcryptd_crypt+0x0/0x449 [dm_crypt]
 [<ffffffff81049bfd>] ? worker_thread+0x146/0x1d8
 [<ffffffff8104d706>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff81049ab7>] ? worker_thread+0x0/0x1d8
 [<ffffffff8104d3f4>] ? kthread+0x7d/0x85
 [<ffffffff8100c9ba>] ? child_rip+0xa/0x20
 [<ffffffff8104d377>] ? kthread+0x0/0x85
 [<ffffffff8100c9b0>] ? child_rip+0x0/0x20
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd: 171
CPU    1: hi:  186, btch:  31 usd: 177
active_anon:298532 inactive_anon:100163 isolated_anon:52
 active_file:3993 inactive_file:4001 isolated_file:12
 unevictable:399 dirty:0 writeback:76102 unstable:0 buffer:125
 free:14107 slab_reclaimable:4510 slab_unreclaimable:20421
 mapped:7949 shmem:0 pagetables:4437 bounce:0
DMA free:7928kB min:40kB low:48kB high:60kB active_anon:3340kB inactive_anon:3608kB active_file:384kB
   inactive_file:472kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15336kB mlocked:0kB
   dirty:0kB writeback:80kB mapped:256kB shmem:0kB slab_reclaimable:12kB slab_unreclaimable:104kB kernel_stack:0kB
   pagetables:16kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 1976 1976 1976
DMA32 free:48500kB min:5664kB low:7080kB high:8496kB active_anon:1190788kB inactive_anon:397044kB active_file:15588kB
   inactive_file:15532kB unevictable:1596kB isolated(anon):208kB isolated(file):48kB present:2023748kB mlocked:1596kB
   dirty:0kB writeback:304328kB mapped:31540kB shmem:0kB slab_reclaimable:18028kB slab_unreclaimable:81496kB kernel_stack:1672kB
   pagetables:17732kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 19*4kB 13*8kB 3*16kB 7*32kB 11*64kB 11*128kB 5*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 7940kB
DMA32: 9299*4kB 1341*8kB 4*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 48500kB
98572 total pagecache pages
90213 pages in swap cache
Swap cache stats: add 175874, delete 85661, find 7850/8731
Free swap  = 1425944kB
Total swap = 2097144kB
518064 pages RAM
10350 pages reserved
82388 pages shared
437481 pages non-shared
iwlagn 0000:10:00.0: Failed to allocate SKB buffer with GFP_ATOMIC. Only 0 free buffers remaining.
swapper: page allocation failure. order:2, mode:0x4020
Pid: 0, comm: swapper Not tainted 2.6.32-rc3 #22
Call Trace:
 <IRQ>  [<ffffffff8107c3d5>] __alloc_pages_nodemask+0x5a2/0x5ec
 [<ffffffff81264892>] ? _spin_unlock+0x9/0xb
 [<ffffffff811e73cd>] ? __alloc_skb+0x3c/0x15b
 [<ffffffffa03202cb>] ? iwl_rx_allocate+0x8f/0x305 [iwlcore]
 [<ffffffff8107c431>] __get_free_pages+0x12/0x41
 [<ffffffff8109cb1a>] __kmalloc_track_caller+0x3b/0xed
 [<ffffffff811e73f7>] __alloc_skb+0x66/0x15b
 [<ffffffffa03202cb>] iwl_rx_allocate+0x8f/0x305 [iwlcore]
 [<ffffffffa0320557>] iwl_rx_replenish_now+0x16/0x23 [iwlcore]
 [<ffffffffa035c0c8>] iwl_rx_handle+0x3a8/0x3c1 [iwlagn]
 [<ffffffffa035c60d>] iwl_irq_tasklet_legacy+0x52c/0x7a4 [iwlagn]
 [<ffffffffa0317aaf>] ? __iwl_read32+0xa5/0xb4 [iwlcore]
 [<ffffffff8103efb8>] tasklet_action+0x71/0xbc
 [<ffffffff8103f837>] __do_softirq+0x96/0x11b
 [<ffffffff8100cabc>] call_softirq+0x1c/0x28
 [<ffffffff8100e5ef>] do_softirq+0x33/0x6b
 [<ffffffff8103f5c5>] irq_exit+0x36/0x75
 [<ffffffff8100dcf1>] do_IRQ+0xa3/0xba
 [<ffffffff8100c353>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffffa0278ec7>] ? acpi_idle_enter_simple+0xf9/0x127 [processor]
 [<ffffffffa0278ebd>] ? acpi_idle_enter_simple+0xef/0x127 [processor]
 [<ffffffff811da545>] ? cpuidle_idle_call+0x8c/0xc7
 [<ffffffff8100ae2e>] ? cpu_idle+0x55/0x8d
 [<ffffffff8125432d>] ? rest_init+0x61/0x63
 [<ffffffff81436c3e>] ? start_kernel+0x348/0x353
 [<ffffffff8143629a>] ? x86_64_start_reservations+0xaa/0xae
 [<ffffffff8143637f>] ? x86_64_start_kernel+0xe1/0xe8
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd: 171
CPU    1: hi:  186, btch:  31 usd: 155
active_anon:297901 inactive_anon:99948 isolated_anon:52
 active_file:3920 inactive_file:3948 isolated_file:12
 unevictable:399 dirty:0 writeback:34634 unstable:0 buffer:125
 free:23390 slab_reclaimable:4510 slab_unreclaimable:11714
 mapped:7819 shmem:0 pagetables:4437 bounce:0
DMA free:7908kB min:40kB low:48kB high:60kB active_anon:3340kB inactive_anon:3608kB active_file:384kB
   inactive_file:472kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15336kB mlocked:0kB
   dirty:0kB writeback:36kB mapped:256kB shmem:0kB slab_reclaimable:12kB slab_unreclaimable:12kB kernel_stack:0kB
   pagetables:16kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 1976 1976 1976
DMA32 free:85652kB min:5664kB low:7080kB high:8496kB active_anon:1188264kB inactive_anon:396184kB active_file:15296kB
   inactive_file:15320kB unevictable:1596kB isolated(anon):208kB isolated(file):48kB present:2023748kB mlocked:1596kB
   dirty:0kB writeback:138500kB mapped:31020kB shmem:0kB slab_reclaimable:18028kB slab_unreclaimable:46844kB
   kernel_stack:1672kB pagetables:17732kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 17*4kB 12*8kB 4*16kB 6*32kB 11*64kB 11*128kB 5*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 7908kB
DMA32: 12419*4kB 4439*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 85652kB
97616 total pagecache pages
89394 pages in swap cache
Swap cache stats: add 175906, delete 86512, find 7850/8733
Free swap  = 1425864kB
Total swap = 2097144kB
518064 pages RAM
10350 pages reserved
82282 pages shared
428383 pages non-shared
iwlagn 0000:10:00.0: Failed to allocate SKB buffer with GFP_ATOMIC. Only 0 free buffers remaining.
--
To unsubscribe from this list: send the line "unsubscribe kernel-testers" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux