Linux 3.18 and Linux 4.1 reproducible OOM crashes under Xen

The Xen Project CI system (`osstest') reports that the current tips of
the Linux 3.18 and Linux 4.1 stable branches have serious problems
booting under Xen.

Our automatic bisector has completed its work and fingered (in 3.18)
a2d8c5147532 "mm/swap.c: flush lru pvecs on compound page arrival"
(and, in Linux 4.1, a backport of what appears to be the same commit).

The failure is accompanied by copious oom-killer activity.

For the full logs for a repro, see

  http://logs.test-lab.xenproject.org/osstest/logs/97435/test-amd64-amd64-xl-credit2/info.html

See particularly `serial-chardonnay0.log', near `Jul 16 14:16:32'.

More details below, including:

  - serial log extract from an earlier test failure
      (for our reference, flight 97278 job test-amd64-amd64-xl)

  - report from the osstest bisector

Thanks for your attention.

Ian.
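(For anyone triaging similar logs: the oom-killer invocations in a serial log like the extract below can be pulled out mechanically. This is an illustrative sketch, not part of osstest; the regular expression simply matches the standard "X invoked oom-killer" line format seen in the log.)

```python
import re

# Match lines such as:
#   [   22.813518] rc.local invoked oom-killer: gfp_mask=0x84d0, order=0, ...
OOM_RE = re.compile(
    r"(?P<comm>\S+) invoked oom-killer: "
    r"gfp_mask=(?P<gfp>0x[0-9a-f]+), order=(?P<order>\d+)"
)

def oom_events(log_text):
    """Return (comm, gfp_mask, order) for each oom-killer invocation found."""
    return [
        (m["comm"], int(m["gfp"], 16), int(m["order"]))
        for m in OOM_RE.finditer(log_text)
    ]

sample = (
    "[   22.813518] rc.local invoked oom-killer: "
    "gfp_mask=0x84d0, order=0, oom_score_adj=0"
)
print(oom_events(sample))
```

In the extract below this finds five invocations (rc.local, sshd, oxenstored, and two from getty) within about a second of boot time, which is what makes the failure look like a systematic accounting problem rather than a single runaway process.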


Jul 14 00:27:33.715871 [   22.813518] rc.local invoked oom-killer: gfp_mask=0x84d0, order=0, oom_score_adj=0

Jul 14 00:27:33.899843 [   22.813541] rc.local cpuset=/ mems_allowed=0

Jul 14 00:27:33.907821 [   22.813550] CPU: 0 PID: 2676 Comm: rc.local Not tainted 3.18.37 #1

Jul 14 00:27:33.907861 [   22.813556] Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS S1200BT.86B.02.00.0042.050820141549 05/08/2014

Jul 14 00:27:33.923914 [   22.813565]  0000000000000000 ffff8800024b7968 ffffffff817dcab5 00000000000084d0

Jul 14 00:27:33.931995 [   22.813573]  0000000000000000 ffff8800024b79c8 ffffffff8118082f ffff8800024b7988

Jul 14 00:27:33.940021 [   22.813582]  ffffffff8114c315 ffffffff817e5ee9 0000000000000001 ffff8800024b79c8

Jul 14 00:27:33.948034 [   22.813590] Call Trace:

Jul 14 00:27:33.948065 [   22.813600]  [<ffffffff817dcab5>] dump_stack+0x7c/0x98

Jul 14 00:27:33.955838 [   22.813609]  [<ffffffff8118082f>] dump_header.isra.11+0x8f/0x1e0

Jul 14 00:27:33.963831 [   22.813616]  [<ffffffff8114c315>] ? __delayacct_freepages_end+0x45/0x50

Jul 14 00:27:33.963872 [   22.813677]  [<ffffffff817e5ee9>] ? _raw_spin_unlock_irqrestore+0x29/0x90

Jul 14 00:27:33.971868 [   22.813685]  [<ffffffff8131f93e>] ? ___ratelimit+0xae/0x160

Jul 14 00:27:33.979838 [   22.813691]  [<ffffffff81180ddc>] oom_kill_process+0x20c/0x370

Jul 14 00:27:33.987847 [   22.813699]  [<ffffffff810bbc69>] ? has_capability_noaudit+0x19/0x20

Jul 14 00:27:33.987888 [   22.813706]  [<ffffffff811812d1>] out_of_memory+0x211/0x330

Jul 14 00:27:33.995847 [   22.813714]  [<ffffffff81186331>] __alloc_pages_nodemask+0xb11/0xb50

Jul 14 00:27:34.003844 [   22.813721]  [<ffffffff81186382>] __get_free_pages+0x12/0x70

Jul 14 00:27:34.011845 [   22.813728]  [<ffffffff811863f1>] get_zeroed_page+0x11/0x20

Jul 14 00:27:34.011882 [   22.813735]  [<ffffffff811a9c31>] __pud_alloc+0x21/0x120

Jul 14 00:27:34.019841 [   22.813742]  [<ffffffff811ab312>] handle_mm_fault+0x352/0xd50

Jul 14 00:27:34.027863 [   22.813749]  [<ffffffff811a550f>] ? follow_page_mask+0x2f/0x4f0

Jul 14 00:27:34.035848 [   22.813756]  [<ffffffff811af392>] ? find_vma+0x62/0x70

Jul 14 00:27:34.035885 [   22.813763]  [<ffffffff811a5b48>] __get_user_pages+0x178/0x660

Jul 14 00:27:34.043839 [   22.813770]  [<ffffffff811a607d>] get_user_pages+0x4d/0x50

Jul 14 00:27:34.051847 [   22.813776]  [<ffffffff811db546>] copy_strings.isra.26+0x176/0x310

Jul 14 00:27:34.051886 [   22.813782]  [<ffffffff811db71a>] copy_strings_kernel+0x3a/0x60

Jul 14 00:27:34.059844 [   22.813789]  [<ffffffff811dcad8>] do_execve_common.isra.33+0x418/0x680

Jul 14 00:27:34.067846 [   22.813795]  [<ffffffff811dd014>] SyS_execve+0x24/0x30

Jul 14 00:27:34.075839 [   22.813801]  [<ffffffff817e6b19>] stub_execve+0x69/0xa0

Jul 14 00:27:34.075876 [   22.813807] Mem-Info:

Jul 14 00:27:34.083841 [   22.813810] DMA per-cpu:

Jul 14 00:27:34.083872 [   22.813813] CPU    0: hi:    0, btch:   1 usd:   0

Jul 14 00:27:34.083905 [   22.813817] CPU    1: hi:    0, btch:   1 usd:   0

Jul 14 00:27:34.091848 [   22.813821] CPU    2: hi:    0, btch:   1 usd:   0

Jul 14 00:27:34.099849 [   22.813825] CPU    3: hi:    0, btch:   1 usd:   0

Jul 14 00:27:34.099885 [   22.813829] DMA32 per-cpu:

Jul 14 00:27:34.107837 [   22.813833] CPU    0: hi:  186, btch:  31 usd:  88

Jul 14 00:27:34.107872 [   22.813837] CPU    1: hi:  186, btch:  31 usd:  72

Jul 14 00:27:34.115855 [   22.813841] CPU    2: hi:  186, btch:  31 usd:  47

Jul 14 00:27:34.115892 [   22.813845] CPU    3: hi:  186, btch:  31 usd: 171

Jul 14 00:27:34.123844 [   22.813852] active_anon:0 inactive_anon:0 isolated_anon:0

Jul 14 00:27:34.131837 [   22.813852]  active_file:0 inactive_file:0 isolated_file:0

Jul 14 00:27:34.131873 [   22.813852]  unevictable:69 dirty:9 writeback:0 unstable:0

Jul 14 00:27:34.139847 [   22.813852]  free:1036 slab_reclaimable:1887 slab_unreclaimable:4091

Jul 14 00:27:34.147839 [   22.813852]  mapped:4099 shmem:154 pagetables:706 bounce:0

Jul 14 00:27:34.155836 [   22.813852]  free_cma:0

Jul 14 00:27:34.155867 [   22.813875] DMA free:1680kB min:96kB low:120kB high:144kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:4kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:1048kB dirty:0kB writeback:0kB mapped:408kB shmem:8kB slab_reclaimable:92kB slab_unreclaimable:328kB kernel_stack:64kB pagetables:148kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes

Jul 14 00:27:34.195860 [   22.813896] lowmem_reserve[]: 0 396 396 396

Jul 14 00:27:34.195898 [   22.813908] DMA32 free:2464kB min:2496kB low:3120kB high:3744kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:272kB isolated(anon):0kB isolated(file):0kB present:508308kB managed:408952kB mlocked:32180kB dirty:36kB writeback:0kB mapped:15988kB shmem:608kB slab_reclaimable:7456kB slab_unreclaimable:16036kB kernel_stack:2720kB pagetables:2676kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes

Jul 14 00:27:34.243847 [   22.813929] lowmem_reserve[]: 0 0 0 0

Jul 14 00:27:34.243882 [   22.813936] DMA: 0*4kB 0*8kB 1*16kB (R) 0*32kB 0*64kB 1*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 1680kB

Jul 14 00:27:34.251860 [   22.813956] DMA32: 0*4kB 0*8kB 1*16kB (R) 1*32kB (R) 0*64kB 1*128kB (R) 1*256kB (R) 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB = 2480kB

Jul 14 00:27:34.267804 [   22.813976] 15037 total pagecache pages

Jul 14 00:27:34.267825 [   22.813980] 0 pages in swap cache

Jul 14 00:27:34.275851 [   22.813984] Swap cache stats: add 0, delete 0, find 0/0

Jul 14 00:27:34.283845 [   22.813988] Free swap  = 1949692kB

Jul 14 00:27:34.283879 [   22.813991] Total swap = 1949692kB

Jul 14 00:27:34.283908 [   22.813994] 131071 pages RAM

Jul 14 00:27:34.291912 [   22.813997] 0 pages HighMem/MovableOnly

Jul 14 00:27:34.291946 [   22.814000] 24860 pages reserved

Jul 14 00:27:34.299797 [   22.814003] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name

Jul 14 00:27:34.307798 [   22.814013] [ 1316]     0  1316     9825      819      21        0         -1000 udevd

Jul 14 00:27:34.315792 [   22.814021] [ 2038]     0  2038     1084       25       7        0             0 ntpdate

Jul 14 00:27:34.323800 [   22.814027] [ 2040]     0  2040     1569      172       9        0             0 lockfile-create

Jul 14 00:27:34.331829 [   22.814034] [ 2047]     0  2047     1060      331       7        0             0 startpar

Jul 14 00:27:34.339797 [   22.814040] [ 2061]     0  2061     9270      668      24        0             0 rpcbind

Jul 14 00:27:34.347796 [   22.814047] [ 2076]   106  2076     9320      700      22        0             0 rpc.statd

Jul 14 00:27:34.355808 [   22.814054] [ 2088]     0  2088     6892       58      18        0             0 rpc.idmapd

Jul 14 00:27:34.363800 [   22.814060] [ 2131]     0  2131     1084      410       7        0             0 rc

Jul 14 00:27:34.371807 [   22.814066] [ 2137]     0  2137     3202      648      13        0             0 startpar

Jul 14 00:27:34.379805 [   22.814073] [ 2160]     0  2160    65721      788      33        0             0 rsyslogd

Jul 14 00:27:34.387805 [   22.814079] [ 2175]     0  2175     1064      394       8        0             0 acpid

Jul 14 00:27:34.395808 [   22.814085] [ 2211]     0  2211    19431     1167      41        0             0 apache2

Jul 14 00:27:34.403799 [   22.814091] [ 2214]    33  2214    91722     1032      69        0             0 apache2

Jul 14 00:27:34.411802 [   22.814098] [ 2216]    33  2216    91722     1031      69        0             0 apache2

Jul 14 00:27:34.419800 [   22.814104] [ 2422]     0  2422     4756       42      13        0             0 atd

Jul 14 00:27:34.427806 [   22.814110] [ 2423]     0  2423     6869      590      18        0             0 cron

Jul 14 00:27:34.435804 [   22.814117] [ 2606]     0  2606     8780      883      21        0             0 oxenstored

Jul 14 00:27:34.443846 [   22.814123] [ 2619]   105  2619    10560      551      24        0             0 dbus-daemon

Jul 14 00:27:34.451802 [   22.814130] [ 2624]     0  2624    13796      700      28        0         -1000 sshd

Jul 14 00:27:34.459799 [   22.814136] [ 2631]     0  2631    22682       57      19        0             0 xenconsoled

Jul 14 00:27:34.467815 [   22.814143] [ 2643]   104  2643    13313      828      26        0             0 exim4

Jul 14 00:27:34.475801 [   22.814149] [ 2652]     0  2652    60164     1974      78        0             0 qemu-system-i38

Jul 14 00:27:34.483868 [   22.814156] [ 2656]     0  2656    20648     1510      42        0             0 sshd

Jul 14 00:27:34.491859 [   22.814162] [ 2660]   107  2660     8345     1093      21        0             0 ntpd

Jul 14 00:27:34.499864 [   22.814169] [ 2673]     0  2673     1084      411       7        0             0 rc.local

Jul 14 00:27:34.507852 [   22.814175] [ 2676]     0  2676     1084       44       6        0             0 rc.local

Jul 14 00:27:34.515859 [   22.814181] [ 2677]  1000  2677    20648      946      41        0             0 sshd

Jul 14 00:27:34.523849 [   22.814188] Out of memory: Kill process 2652 (qemu-system-i38) score 3 or sacrifice child

Jul 14 00:27:34.531862 [   22.814194] Killed process 2652 (qemu-system-i38) total-vm:240656kB, anon-rss:2440kB, file-rss:5456kB

Jul 14 00:27:34.539871 [   22.822755] sshd invoked oom-killer: gfp_mask=0x2000d0, order=0, oom_score_adj=0

Jul 14 00:27:34.547800 [   22.822760] sshd cpuset=/ mems_allowed=0

Jul 14 00:27:34.555868 [   22.822764] CPU: 1 PID: 2677 Comm: sshd Not tainted 3.18.37 #1

Jul 14 00:27:34.563847 [   22.822767] Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS S1200BT.86B.02.00.0042.050820141549 05/08/2014

Jul 14 00:27:34.571852 [   22.822771]  0000000000000000 ffff8800024fb990 ffffffff817dcab5 00000000002000d0

Jul 14 00:27:34.579860 [   22.822775]  0000000000000000 ffff8800024fb9f0 ffffffff8118082f ffff8800024fb9b0

Jul 14 00:27:34.587803 [   22.822779]  ffffffff8114c315 ffffffff817e5ee9 0000000000000001 ffff8800024fb9f0

Jul 14 00:27:34.595862 [   22.822783] Call Trace:

Jul 14 00:27:34.595892 [   22.822787]  [<ffffffff817dcab5>] dump_stack+0x7c/0x98

Jul 14 00:27:34.603859 [   22.822791]  [<ffffffff8118082f>] dump_header.isra.11+0x8f/0x1e0

Jul 14 00:27:34.611843 [   22.822794]  [<ffffffff8114c315>] ? __delayacct_freepages_end+0x45/0x50

Jul 14 00:27:34.619891 [   22.822798]  [<ffffffff817e5ee9>] ? _raw_spin_unlock_irqrestore+0x29/0x90

Jul 14 00:27:34.619933 [   22.822801]  [<ffffffff8131f93e>] ? ___ratelimit+0xae/0x160

Jul 14 00:27:34.627853 [   22.822804]  [<ffffffff81180ddc>] oom_kill_process+0x20c/0x370

Jul 14 00:27:34.635900 [   22.822808]  [<ffffffff810bbc69>] ? has_capability_noaudit+0x19/0x20

Jul 14 00:27:34.643851 [   22.822812]  [<ffffffff811812d1>] out_of_memory+0x211/0x330

Jul 14 00:27:34.643888 [   22.822815]  [<ffffffff81186331>] __alloc_pages_nodemask+0xb11/0xb50

Jul 14 00:27:34.651845 [   22.822820]  [<ffffffff811cb40c>] new_slab+0x2bc/0x320

Jul 14 00:27:34.660009 [   22.822823]  [<ffffffff811cc770>] __slab_alloc.constprop.69+0x440/0x570

Jul 14 00:27:34.667913 [   22.822828]  [<ffffffff81054f3e>] ? xen_leave_lazy_mmu+0xe/0x10

Jul 14 00:27:34.667952 [   22.822831]  [<ffffffff811b6721>] ? anon_vma_fork+0x71/0x140

Jul 14 00:27:34.675855 [   22.822835]  [<ffffffff811b651b>] ? anon_vma_clone+0x6b/0x200

Jul 14 00:27:34.683863 [   22.822838]  [<ffffffff811b6721>] ? anon_vma_fork+0x71/0x140

Jul 14 00:27:34.691849 [   22.822841]  [<ffffffff811cdad6>] kmem_cache_alloc+0x116/0x150

Jul 14 00:27:34.691887 [   22.822844]  [<ffffffff811b6721>] anon_vma_fork+0x71/0x140

Jul 14 00:27:34.699848 [   22.822848]  [<ffffffff810b2017>] copy_process.part.45+0x1687/0x1a50

Jul 14 00:27:34.707853 [   22.822852]  [<ffffffff810b2577>] do_fork+0xb7/0x3b0

Jul 14 00:27:34.707888 [   22.822856]  [<ffffffff811f3a34>] ? get_unused_fd_flags+0x34/0x40

Jul 14 00:27:34.715857 [   22.822860]  [<ffffffff811f3a88>] ? __fd_install+0x48/0x60

Jul 14 00:27:34.723872 [   22.822862]  [<ffffffff810b28f1>] SyS_clone+0x11/0x20

Jul 14 00:27:34.723910 [   22.822866]  [<ffffffff817e68c9>] stub_clone+0x69/0x90

Jul 14 00:27:34.731848 [   22.822868]  [<ffffffff817e6589>] ? system_call_fastpath+0x12/0x17

Jul 14 00:27:34.739823 [   22.822871] Mem-Info:

Jul 14 00:27:34.739841 [   22.822872] DMA per-cpu:

Jul 14 00:27:34.747843 [   22.822874] CPU    0: hi:    0, btch:   1 usd:   0

Jul 14 00:27:34.747879 [   22.822876] CPU    1: hi:    0, btch:   1 usd:   0

Jul 14 00:27:34.755836 [   22.822878] CPU    2: hi:    0, btch:   1 usd:   0

Jul 14 00:27:34.755870 [   22.822880] CPU    3: hi:    0, btch:   1 usd:   0

Jul 14 00:27:34.763850 [   22.822882] DMA32 per-cpu:

Jul 14 00:27:34.763881 [   22.822884] CPU    0: hi:  186, btch:  31 usd:  88

Jul 14 00:27:34.771838 [   22.822886] CPU    1: hi:  186, btch:  31 usd: 157

Jul 14 00:27:34.779837 [   22.822888] CPU    2: hi:  186, btch:  31 usd:  47

Jul 14 00:27:34.779872 [   22.822890] CPU    3: hi:  186, btch:  31 usd: 173

Jul 14 00:27:34.787884 [   22.822893] active_anon:0 inactive_anon:0 isolated_anon:0

Jul 14 00:27:34.787920 [   22.822893]  active_file:0 inactive_file:0 isolated_file:0

Jul 14 00:27:34.795844 [   22.822893]  unevictable:69 dirty:9 writeback:0 unstable:0

Jul 14 00:27:34.803834 [   22.822893]  free:1036 slab_reclaimable:1887 slab_unreclaimable:4091

Jul 14 00:27:34.811846 [   22.822893]  mapped:3101 shmem:154 pagetables:650 bounce:0

Jul 14 00:27:34.811884 [   22.822893]  free_cma:0

Jul 14 00:27:34.819842 [   22.822904] DMA free:1680kB min:96kB low:120kB high:144kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:4kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:1048kB dirty:0kB writeback:0kB mapped:336kB shmem:8kB slab_reclaimable:92kB slab_unreclaimable:328kB kernel_stack:64kB pagetables:148kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes

Jul 14 00:27:34.859831 [   22.822914] lowmem_reserve[]: 0 396 396 396

Jul 14 00:27:34.859866 [   22.822920] DMA32 free:2464kB min:2496kB low:3120kB high:3744kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:272kB isolated(anon):0kB isolated(file):0kB present:508308kB managed:408952kB mlocked:32180kB dirty:36kB writeback:0kB mapped:12068kB shmem:608kB slab_reclaimable:7456kB slab_unreclaimable:16036kB kernel_stack:2720kB pagetables:2452kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes

Jul 14 00:27:34.899849 [   22.822930] lowmem_reserve[]: 0 0 0 0

Jul 14 00:27:34.907838 [   22.822934] DMA: 0*4kB 0*8kB 1*16kB (R) 0*32kB 0*64kB 1*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 1680kB

Jul 14 00:27:34.915876 [   22.822943] DMA32: 0*4kB 0*8kB 1*16kB (R) 1*32kB (R) 0*64kB 1*128kB (R) 1*256kB (R) 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB = 2480kB

Jul 14 00:27:34.931842 [   22.822953] 15037 total pagecache pages

Jul 14 00:27:34.931878 [   22.822955] 0 pages in swap cache

Jul 14 00:27:34.939850 [   22.822957] Swap cache stats: add 0, delete 0, find 0/0

Jul 14 00:27:34.939888 [   22.822959] Free swap  = 1949692kB

Jul 14 00:27:34.947842 [   22.822960] Total swap = 1949692kB

Jul 14 00:27:34.947874 [   22.822962] 131071 pages RAM

Jul 14 00:27:34.947903 [   22.822963] 0 pages HighMem/MovableOnly

Jul 14 00:27:34.955868 [   22.822965] 24860 pages reserved

Jul 14 00:27:34.955900 [   22.822966] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name

Jul 14 00:27:34.963860 [   22.822971] [ 1316]     0  1316     9825      819      21        0         -1000 udevd

Jul 14 00:27:34.971874 [   22.822976] [ 2038]     0  2038     1084       25       7        0             0 ntpdate

Jul 14 00:27:34.979864 [   22.822978] [ 2040]     0  2040     1569      172       9        0             0 lockfile-create

Jul 14 00:27:34.987862 [   22.822981] [ 2047]     0  2047     1060      331       7        0             0 startpar

Jul 14 00:27:34.995869 [   22.822984] [ 2061]     0  2061     9270      668      24        0             0 rpcbind

Jul 14 00:27:35.003863 [   22.822987] [ 2076]   106  2076     9320      700      22        0             0 rpc.statd

Jul 14 00:27:35.019848 [   22.822991] [ 2088]     0  2088     6892       58      18        0             0 rpc.idmapd

Jul 14 00:27:35.027853 [   22.822994] [ 2131]     0  2131     1084      410       7        0             0 rc

Jul 14 00:27:35.035843 [   22.822997] [ 2137]     0  2137     3202      648      13        0             0 startpar

Jul 14 00:27:35.043857 [   22.823000] [ 2160]     0  2160    65721      788      33        0             0 rsyslogd

Jul 14 00:27:35.051846 [   22.823003] [ 2175]     0  2175     1064      394       8        0             0 acpid

Jul 14 00:27:35.059852 [   22.823006] [ 2211]     0  2211    19431     1167      41        0             0 apache2

Jul 14 00:27:35.067848 [   22.823009] [ 2214]    33  2214    91722     1032      69        0             0 apache2

Jul 14 00:27:35.075852 [   22.823012] [ 2216]    33  2216    91722     1031      69        0             0 apache2

Jul 14 00:27:35.083847 [   22.823015] [ 2422]     0  2422     4756       42      13        0             0 atd

Jul 14 00:27:35.091857 [   22.823018] [ 2423]     0  2423     6869      590      18        0             0 cron

Jul 14 00:27:35.099857 [   22.823021] [ 2606]     0  2606     8780      883      21        0             0 oxenstored

Jul 14 00:27:35.107854 [   22.823024] [ 2619]   105  2619    10560      551      24        0             0 dbus-daemon

Jul 14 00:27:35.115866 [   22.823027] [ 2624]     0  2624    13796      700      28        0         -1000 sshd

Jul 14 00:27:35.123797 [   22.823030] [ 2631]     0  2631    22682       57      19        0             0 xenconsoled

Jul 14 00:27:35.131849 [   22.823034] [ 2643]   104  2643    13313      828      26        0             0 exim4

Jul 14 00:27:35.139844 [   22.823037] [ 2656]     0  2656    20648     1510      42        0             0 sshd

Jul 14 00:27:35.147846 [   22.823040] [ 2660]   107  2660     8345     1093      21        0             0 ntpd

Jul 14 00:27:35.155856 [   22.823043] [ 2673]     0  2673     1084      411       7        0             0 rc.local

Jul 14 00:27:35.163847 [   22.823046] [ 2676]     0  2676     1084       44       6        0             0 rc.local

Jul 14 00:27:35.171852 [   22.823049] [ 2677]  1000  2677    20648      946      41        0             0 sshd

Jul 14 00:27:35.179883 [   22.823052] Out of memory: Kill process 2656 (sshd) score 2 or sacrifice child

Jul 14 00:27:35.187860 [   22.823055] Killed process 2677 (sshd) total-vm:82592kB, anon-rss:848kB, file-rss:2936kB

Jul 14 00:27:35.195838 [   22.825444] oxenstored invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0

Jul 14 00:27:35.203861 [   22.825449] oxenstored cpuset=/ mems_allowed=0

Jul 14 00:27:35.203897 [   22.825453] CPU: 2 PID: 2606 Comm: oxenstored Not tainted 3.18.37 #1

Jul 14 00:27:35.211845 [   22.825457] Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS S1200BT.86B.02.00.0042.050820141549 05/08/2014

Jul 14 00:27:35.227847 [   22.825461]  0000000000000000 ffff880008dc3ad8 ffffffff817dcab5 00000000000280da

Jul 14 00:27:35.235847 [   22.825466]  0000000000000000 ffff880008dc3b38 ffffffff8118082f ffff880008dc3af8

Jul 14 00:27:35.243891 [   22.825469]  ffffffff8114c315 ffffffff817e5ee9 0000000000000001 ffff880008dc3b38

Jul 14 00:27:35.243937 [   22.825473] Call Trace:

Jul 14 00:27:35.251845 [   22.825477]  [<ffffffff817dcab5>] dump_stack+0x7c/0x98

Jul 14 00:27:35.251881 [   22.825481]  [<ffffffff8118082f>] dump_header.isra.11+0x8f/0x1e0

Jul 14 00:27:35.259846 [   22.825484]  [<ffffffff8114c315>] ? __delayacct_freepages_end+0x45/0x50

Jul 14 00:27:35.267864 [   22.825488]  [<ffffffff817e5ee9>] ? _raw_spin_unlock_irqrestore+0x29/0x90

Jul 14 00:27:35.275843 [   22.825492]  [<ffffffff8131f93e>] ? ___ratelimit+0xae/0x160

Jul 14 00:27:35.283904 [   22.825494]  [<ffffffff81180ddc>] oom_kill_process+0x20c/0x370

Jul 14 00:27:35.283942 [   22.825498]  [<ffffffff810bbc69>] ? has_capability_noaudit+0x19/0x20

Jul 14 00:27:35.291845 [   22.825501]  [<ffffffff811812d1>] out_of_memory+0x211/0x330

Jul 14 00:27:35.299881 [   22.825505]  [<ffffffff81186331>] __alloc_pages_nodemask+0xb11/0xb50

Jul 14 00:27:35.307834 [   22.825509]  [<ffffffff811ab988>] handle_mm_fault+0x9c8/0xd50

Jul 14 00:27:35.307871 [   22.825513]  [<ffffffff811ea170>] ? poll_select_copy_remaining+0x130/0x130

Jul 14 00:27:35.315942 [   22.825517]  [<ffffffff810a68dd>] __do_page_fault+0x1bd/0x590

Jul 14 00:27:35.323846 [   22.825521]  [<ffffffff[   23.782240] getty invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0

Jul 14 00:27:35.331856 [   23.782247] getty cpuset=/ mems_allowed=0

Jul 14 00:27:35.339842 [   23.782252] CPU: 3 PID: 2685 Comm: getty Not tainted 3.18.37 #1

Jul 14 00:27:35.339884 [   23.782255] Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS S1200BT.86B.02.00.0042.050820141549 05/08/2014

Jul 14 00:27:35.355849 [   23.782259]  0000000000000000 ffff8800158db7f8 ffffffff817dcab5 00000000000280da

Jul 14 00:27:35.363925 [   23.782263]  0000000000000000 ffff8800158db858 ffffffff8118082f ffff8800158db818

Jul 14 00:27:35.371864 [   23.782268]  ffffffff8114c315 ffffffff817e5ee9 0000000000000001 ffff8800158db858

Jul 14 00:27:35.379855 [   23.782272] Call Trace:

Jul 14 00:27:35.379886 [   23.782278]  [<ffffffff817dcab5>] dump_stack+0x7c/0x98

Jul 14 00:27:35.387850 [   23.782283]  [<ffffffff8118082f>] dump_header.isra.11+0x8f/0x1e0

Jul 14 00:27:35.387888 [   23.782287]  [<ffffffff8114c315>] ? __delayacct_freepages_end+0x45/0x50

Jul 14 00:27:35.395877 [   23.782291]  [<ffffffff817e5ee9>] ? _raw_spin_unlock_irqrestore+0x29/0x90

Jul 14 00:27:35.403858 [   23.782295]  [<ffffffff8131f93e>] ? ___ratelimit+0xae/0x160

Jul 14 00:27:35.411839 [   23.782298]  [<ffffffff81180ddc>] oom_kill_process+0x20c/0x370

Jul 14 00:27:35.419845 [   23.782301]  [<ffffffff811812d1>] out_of_memory+0x211/0x330

Jul 14 00:27:35.419883 [   23.782305]  [<ffffffff81186331>] __alloc_pages_nodemask+0xb11/0xb50

Jul 14 00:27:35.427868 [   23.782309]  [<ffffffff811ab988>] handle_mm_fault+0x9c8/0xd50

Jul 14 00:27:35.435856 [   23.782320]  [<ffffffffa009d912>] ? search_dir+0xc2/0x120 [ext4]

Jul 14 00:27:35.443840 [   23.782324]  [<ffffffff810a68dd>] __do_page_fault+0x1bd/0x590

Jul 14 00:27:35.443877 [   23.782329]  [<ffffffff81233736>] ? dquot_file_open+0x16/0x50

Jul 14 00:27:35.451860 [   23.782334]  [<ffffffffa008c953>] ? ext4_file_open+0x43/0x180 [ext4]

Jul 14 00:27:35.459849 [   23.782339]  [<ffffffffa008c910>] ? ext4_check_all_de+0xb0/0xb0 [ext4]

Jul 14 00:27:35.467884 [   23.782343]  [<ffffffff810a6cdc>] do_page_fault+0x2c/0x40

Jul 14 00:27:35.467920 [   23.782346]  [<ffffffff817e81b8>] page_fault+0x28/0x30

Jul 14 00:27:35.475841 [   23.782350]  [<ffffffff811a47a8>] ? copy_page_to_iter+0x68/0x3c0

Jul 14 00:27:35.483845 [   23.782353]  [<ffffffff8117ed88>] generic_file_read_iter+0x278/0x5d0

Jul 14 00:27:35.491842 [   23.782357]  [<ffffffff811d4ad9>] new_sync_read+0x79/0xb0

Jul 14 00:27:35.491880 [   23.782360]  [<ffffffff811d5303>] vfs_read+0x93/0x180

Jul 14 00:27:35.499846 [   23.782364]  [<ffffffff811d5e61>] SyS_read+0x41/0xb0

Jul 14 00:27:35.507832 [   23.782367]  [<ffffffff817e6589>] system_call_fastpath+0x12/0x17

Jul 14 00:27:35.507869 [   23.782370] Mem-Info:

Jul 14 00:27:35.515838 [   23.782371] DMA per-cpu:

Jul 14 00:27:35.515869 [   23.782373] CPU    0: hi:    0, btch:   1 usd:   0

Jul 14 00:27:35.523838 [   23.782375] CPU    1: hi:    0, btch:   1 usd:   0

Jul 14 00:27:35.523873 [   23.782377] CPU    2: hi:    0, btch:   1 usd:   0

Jul 14 00:27:35.531843 [   23.782379] CPU    3: hi:    0, btch:   1 usd:   0

Jul 14 00:27:35.531878 [   23.782381] DMA32 per-cpu:

Jul 14 00:27:35.539836 [   23.782382] CPU    0: hi:  186, btch:  31 usd:   0

Jul 14 00:27:35.539873 [   23.782384] CPU    1: hi:  186, btch:  31 usd:   0

Jul 14 00:27:35.547841 [   23.782386] CPU    2: hi:  186, btch:  31 usd:   0

Jul 14 00:27:35.555833 [   23.782388] CPU    3: hi:  186, btch:  31 usd:   0

Jul 14 00:27:35.555868 [   23.782392] active_anon:0 inactive_anon:0 isolated_anon:0

Jul 14 00:27:35.563857 [   23.782392]  active_file:0 inactive_file:0 isolated_file:0

Jul 14 00:27:35.563895 [   23.782392]  unevictable:1 dirty:0 writeback:18 unstable:0

Jul 14 00:27:35.571859 [   23.782392]  free:1060 slab_reclaimable:1888 slab_unreclaimable:4074

Jul 14 00:27:35.579860 [   23.782392]  mapped:2923 shmem:153 pagetables:588 bounce:0

Jul 14 00:27:35.587840 [   23.782392]  free_cma:0

Jul 14 00:27:35.587872 [   23.782403] DMA free:1668kB min:96kB low:120kB high:144kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:1044kB dirty:0kB writeback:4kB mapped:340kB shmem:8kB slab_reclaimable:92kB slab_unreclaimable:332kB kernel_stack:48kB pagetables:116kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes

Jul 14 00:27:35.627846 [   23.782413] lowmem_reserve[]: 0 396 396 396

Jul 14 00:27:35.627882 [   23.782419] DMA32 free:2572kB min:2496kB low:3120kB high:3744kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:4kB isolated(anon):0kB isolated(file):0kB present:508308kB managed:408952kB mlocked:31912kB dirty:0kB writeback:68kB mapped:11352kB shmem:604kB slab_reclaimable:7460kB slab_unreclaimable:15964kB kernel_stack:2400kB pagetables:2236kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:36 all_unreclaimable? yes

Jul 14 00:27:35.675846 [   23.782431] lowmem_reserve[]: 0 0 0 0

Jul 14 00:27:35.675882 [   23.782433] DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 1*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 1664kB

Jul 14 00:27:35.683866 [   23.782442] DMA32: 29*4kB (UR) 0*8kB 3*16kB (R) 0*32kB 1*64kB (R) 1*128kB (R) 1*256kB (R) 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB = 2660kB

Jul 14 00:27:35.699851 [   23.782452] 15076 total pagecache pages

Jul 14 00:27:35.699885 [   23.782455] 0 pages in swap cache

Jul 14 00:27:35.707849 [   23.782465] Swap cache stats: add 0, delete 0, find 0/0

Jul 14 00:27:35.715836 [   23.782468] Free swap  = 1949692kB

Jul 14 00:27:35.715869 [   23.782472] Total swap = 1949692kB

Jul 14 00:27:35.715899 [   23.782476] 131071 pages RAM

Jul 14 00:27:35.723847 [   23.782480] 0 pages HighMem/MovableOnly

Jul 14 00:27:35.723882 [   23.782484] 24860 pages reserved

Jul 14 00:27:35.731838 [   23.782487] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name

Jul 14 00:27:35.739850 [   23.782495] [ 1316]     0  1316     9825      819      21        0         -1000 udevd

Jul 14 00:27:35.747851 [   23.782501] [ 2038]     0  2038     1084       25       7        0             0 ntpdate

Jul 14 00:27:35.755844 [   23.782506] [ 2040]     0  2040     1569      172       9        0             0 lockfile-create

Jul 14 00:27:35.763856 [   23.782509] [ 2047]     0  2047     1060      331       7        0             0 startpar

Jul 14 00:27:35.771890 [   23.782512] [ 2061]     0  2061     9270      668      24        0             0 rpcbind

Jul 14 00:27:35.779829 [   23.782516] [ 2076]   106  2076     9320      700      22        0             0 rpc.statd

Jul 14 00:27:35.787856 [   23.782519] [ 2088]     0  2088     6892       58      18        0             0 rpc.idmapd

Jul 14 00:27:35.795799 [   23.782522] [ 2160]     0  2160    65721      785      33        0             0 rsyslogd

Jul 14 00:27:35.803857 [   23.782525] [ 2175]     0  2175     1064      394       8        0             0 acpid

Jul 14 00:27:35.811854 [   23.782528] [ 2211]     0  2211    19431     1167      41        0             0 apache2

Jul 14 00:27:35.820250 [   23.782532] [ 2216]    33  2216    91722     1031      69        0             0 apache2

Jul 14 00:27:35.827884 [   23.782534] [ 2422]     0  2422     4756       42      13        0             0 atd

Jul 14 00:27:35.835897 [   23.782537] [ 2423]     0  2423     6869      590      18        0             0 cron

Jul 14 00:27:35.843893 [   23.782541] [ 2606]     0  2606     8780      883      21        0             0 oxenstored

Jul 14 00:27:35.851883 [   23.782544] [ 2619]   105  2619    10560      551      24        0             0 dbus-daemon

Jul 14 00:27:35.859858 [   23.782547] [ 2624]     0  2624    13796      700      28        0         -1000 sshd

Jul 14 00:27:35.867866 [   23.782550] [ 2631]     0  2631    22682       57      19        0             0 xenconsoled
Jul 14 00:27:35.875902 [   23.782554] [ 2643]   104  2643    13313      828      26        0             0 exim4
Jul 14 00:27:35.883887 [   23.782557] [ 2660]   107  2660     8346     1093      21        0             0 ntpd
Jul 14 00:27:35.891912 [   23.782560] [ 2681]     0  2681     2650      399      10        0             0 getty
Jul 14 00:27:35.899869 [   23.782563] [ 2682]     0  2682     3166      417      12        0             0 getty
Jul 14 00:27:35.907853 [   23.782566] [ 2683]     0  2683     1063      173       7        0             0 getty
Jul 14 00:27:35.915859 [   23.782569] [ 2684]     0  2684     3166      416      12        0             0 getty
Jul 14 00:27:35.923807 [   23.782572] [ 2685]     0  2685     3560      425      12        0             0 getty
Jul 14 00:27:35.931919 [   23.782575] [ 2686]     0  2686     3166      425      11        0             0 getty
Jul 14 00:27:35.939914 [   23.782578] [ 2687]     0  2687     1591      404       8        0             0 getty
Jul 14 00:27:35.947888 [   23.782581] [ 2688]     0  2688     3166      437      12        0             0 getty
Jul 14 00:27:35.956047 [   23.782587] [ 2690]     0  2690     3166       35       9        0             0 getty
Jul 14 00:27:35.964079 [   23.782592] [ 2691]     0  2691     3166       34       9        0             0 getty
Jul 14 00:27:35.972103 [   23.782597] [ 2693]     0  2693     9825      195      19        0         -1000 udevd
Jul 14 00:27:35.980089 [   23.782601] [ 2694]     0  2694     9825      195      18        0         -1000 udevd
Jul 14 00:27:35.987933 [   23.782604] Out of memory: Kill process 2211 (apache2) score 1 or sacrifice child
Jul 14 00:27:35.996043 [   23.782607] Killed process 2216 (apache2) total-vm:366888kB, anon-rss:1556kB, file-rss:2568kB
Jul 14 00:27:36.004061 [   23.790140] getty invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
Jul 14 00:27:36.011954 [   23.790145] getty cpuset=/ mems_allowed=0
Jul 14 00:27:36.019911 [   23.790150] CPU: 0 PID: 2681 Comm: getty Not tainted 3.18.37 #1
Jul 14 00:27:36.019951 [   23.790152] Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS S1200BT.86B.02.00.0042.050820141549 05/08/2014
Jul 14 00:27:36.035934 [   23.790156]  0000000000000000 ffff88001220ba78 ffffffff817dcab5 00000000000200da
Jul 14 00:27:36.043951 [   23.790161]  0000000000000000 ffff88001220bad8 ffffffff8118082f ffff88001220ba98
Jul 14 00:27:36.051973 [   23.790166]  ffffffff8114c315 ffffffff817e5ee9 0000000000000001 ffff88001220bad8
Jul 14 00:27:36.059965 [   23.790170] Call Trace:
Jul 14 00:27:36.059997 [   23.790174]  [<ffffffff817dcab5>] dump_stack+0x7c/0x98
Jul 14 00:27:36.067938 [   23.790178]  [<ffffffff8118082f>] dump_header.isra.11+0x8f/0x1e0
Jul 14 00:27:36.067977 [   23.790181]  [<ffffffff8114c315>] ? __delayacct_freepages_end+0x45/0x50
Jul 14 00:27:36.076024 [   23.790185]  [<ffffffff817e5ee9>] ? _raw_spin_unlock_irqrestore+0x29/0x90
Jul 14 00:27:36.083947 [   23.790188]  [<ffffffff8131f93e>] ? ___ratelimit+0xae/0x160
Jul 14 00:27:36.092010 [   23.790191]  [<ffffffff81180ddc>] oom_kill_process+0x20c/0x370
Jul 14 00:27:36.099947 [   23.790195]  [<ffffffff811812d1>] out_of_memory+0x211/0x330
Jul 14 00:27:36.099991 [   23.790198]  [<ffffffff81186331>] __alloc_pages_nodemask+0xb11/0xb50
Jul 14 00:27:36.107888 [   23.790202]  [<ffffffff811a9757>] do_cow_fault.isra.96+0x47/0x2a0
Jul 14 00:27:36.116087 [   23.790207]  [<ffffffff81054bf9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
Jul 14 00:27:36.123936 [   23.790210]  [<ffffffff811ab675>] handle_mm_fault+0x6b5/0xd50
Jul 14 00:27:36.124004 [   23.790214]  [<ffffffff810a68dd>] __do_page_fault+0x1bd/0x590
Jul 14 00:27:36.132061 [   23.790217]  [<ffffffff81198939>] ? vm_mmap_pgoff+0x99/0xc0
Jul 14 00:27:36.140090 [   23.790220]  [<ffffffff810a6cdc>] do_page_fault+0x2c/0x40
Jul 14 00:27:36.148022 [   23.790224]  [<ffffffff817e81b8>] page_fault+0x28/0x30
Jul 14 00:27:36.148057 [   23.790226] Mem-Info:
Jul 14 00:27:36.148083 [   23.790227] DMA per-cpu:
Jul 14 00:27:36.155974 [   23.790229] CPU    0: hi:    0, btch:   1 usd:   0
Jul 14 00:27:36.155995 [   23.790231] CPU    1: hi:    0, btch:   1 usd:   0
Jul 14 00:27:36.164132 [   23.790233] CPU    2: hi:    0, btch:   1 usd:   0
Jul 14 00:27:36.172145 [   23.790235] CPU    3: hi:    0, btch:   1 usd:   0
Jul 14 00:27:36.172208 [   23.790237] DMA32 per-cpu:
Jul 14 00:27:36.180000 [   23.790238] CPU    0: hi:  186, btch:  31 usd:   0
Jul 14 00:27:36.180049 [   23.790241] CPU    1: hi:  186, btch:  31 usd: 128
Jul 14 00:27:36.187954 [   23.790243] CPU    2: hi:  186, btch:  31 usd:   0
Jul 14 00:27:36.187996 [   23.790245] CPU    3: hi:  186, btch:  31 usd:   0
Jul 14 00:27:36.195965 [   23.790248] active_anon:0 inactive_anon:0 isolated_anon:0
Jul 14 00:27:36.203964 [   23.790248]  active_file:0 inactive_file:0 isolated_file:0
Jul 14 00:27:36.204010 [   23.790248]  unevictable:1 dirty:4 writeback:0 unstable:0
Jul 14 00:27:36.211969 [   23.790248]  free:1012 slab_reclaimable:1888 slab_unreclaimable:4067
Jul 14 00:27:36.219948 [   23.790248]  mapped:2877 shmem:154 pagetables:521 bounce:0
Jul 14 00:27:36.227943 [   23.790248]  free_cma:0
Jul 14 00:27:36.227982 [   23.790259] DMA free:1680kB min:96kB low:120kB high:144kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:1044kB dirty:0kB writeback:0kB mapped:340kB shmem:8kB slab_reclaimable:92kB slab_unreclaimable:332kB kernel_stack:48kB pagetables:84kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jul 14 00:27:36.267995 [   23.790269] lowmem_reserve[]: 0 396 396 396
Jul 14 00:27:36.268052 [   23.790275] DMA32 free:2368kB min:2496kB low:3120kB high:3744kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:4kB isolated(anon):0kB isolated(file):0kB present:508308kB managed:408952kB mlocked:319
Jul 14 00:28:18.168603 <client 0x1c9b400 connected - now 1 clients>



From: osstest service owner <osstest-admin@xxxxxxxxxxxxxx>
To: <xen-devel@xxxxxxxxxxxxxxxxxxx>, <osstest-admin@xxxxxxxxxxxxxx>
Subject: [linux-3.18 bisection] complete test-amd64-amd64-xl-credit2
Date: Sat, 16 Jul 2016 14:52:57 +0000

branch xen-unstable
xenbranch xen-unstable
job test-amd64-amd64-xl-credit2
testid xen-boot

Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
  Bug introduced:  a2d8c514753276394d68414f563591f174ef86cb
  Bug not present: 8f620446135b64ca6f96cf32066a76d64e79a388
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/97435/


  commit a2d8c514753276394d68414f563591f174ef86cb
  Author: Lukasz Odzioba <lukasz.odzioba@xxxxxxxxx>
  Date:   Fri Jun 24 14:50:01 2016 -0700
  
      mm/swap.c: flush lru pvecs on compound page arrival
      
      [ Upstream commit 8f182270dfec432e93fae14f9208a6b9af01009f ]
      
      Currently we can have compound pages held on per-cpu pagevecs, which
      leaves a lot of memory unavailable for reclaim when it is needed.  On
      systems with hundreds of processors this can amount to gigabytes of
      memory.
      
      One way of reproducing the problem is to not call munmap explicitly
      on all mapped regions (e.g. after receiving SIGTERM).  After that,
      some pages (and, with THP enabled, huge pages) may end up on
      lru_add_pvec; see the example below.
      
        // build with: cc -fopenmp repro.c
        #include <string.h>
        #include <sys/mman.h>

        int main(void) {
        #pragma omp parallel
        {
      	size_t size = 55 * 1000 * 1000; // smaller than MEM/CPUS
      	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
      		MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      	if (p != MAP_FAILED)
      		memset(p, 0, size);
      	//munmap(p, size); // uncomment to make the problem go away
        }
        	return 0;
        }
      
      When we run it with THP enabled it will leave a significant amount of
      memory on lru_add_pvec.  This memory will not be reclaimed if we hit
      OOM, so when we run the above program in a loop:
      
      	for i in `seq 100`; do ./a.out; done
      
      many processes (95% in my case) will be killed by the OOM killer.
      
      The primary point of the LRU add cache is to reduce contention on the
      zone lru_lock, in the hope that more pages will belong to the same
      zone so that their addition can be batched.  A huge page is already a
      form of batched addition (it adds 512 base pages' worth of memory in
      one go), so skipping the batching seems the safer option compared
      with a potential excess in the caching, which can be quite large and
      is much harder to fix, because lru_add_drain_all is far too expensive
      and it is not really clear when would be a good moment to call it.
      
      Similarly, we can reproduce the problem on lru_deactivate_pvec by
      adding madvise(p, size, MADV_FREE); after the memset.
      
      This patch flushes lru pvecs on compound page arrival, making the
      problem less severe - after applying it, the kill rate of the above
      example drops to 0%, because the maximum amount of memory held on a
      pvec falls from 28MB (with THP) to 56kB per CPU.
      
      Suggested-by: Michal Hocko <mhocko@xxxxxxxx>
      Link: http://lkml.kernel.org/r/1466180198-18854-1-git-send-email-lukasz.odzioba@xxxxxxxxx
      Signed-off-by: Lukasz Odzioba <lukasz.odzioba@xxxxxxxxx>
      Acked-by: Michal Hocko <mhocko@xxxxxxxx>
      Cc: Kirill Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
      Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
      Cc: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>
      Cc: Ming Li <mingli199x@xxxxxx>
      Cc: Minchan Kim <minchan@xxxxxxxxxx>
      Cc: <stable@xxxxxxxxxxxxxxx>
      Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
      Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
      Signed-off-by: Sasha Levin <sasha.levin@xxxxxxxxxx>


For bisection revision-tuple graph see:
   http://logs.test-lab.xenproject.org/osstest/results/bisect/linux-3.18/test-amd64-amd64-xl-credit2.xen-boot.html
Revision IDs in each graph node refer, respectively, to the Trees above.

----------------------------------------
Running cs-bisection-step --graph-out=/home/logs/results/bisect/linux-3.18/test-amd64-amd64-xl-credit2.xen-boot --summary-out=tmp/97435.bisection-summary --basis-template=96188 --blessings=real,real-bisect linux-3.18 test-amd64-amd64-xl-credit2 xen-boot
Searching for failure / basis pass:
 97377 fail [host=chardonnay0] / 96188 [host=italia0] 96161 [host=baroque1] 95844 [host=godello0] 95809 [host=godello1] 95597 [host=fiano0] 95521 [host=pinot1] 95458 [host=elbling1] 95406 [host=fiano1] 94728 [host=merlot1] 94153 [host=elbling0] 94083 ok.
Failure / basis pass flights: 97377 / 94083
(tree with no url: minios)
(tree with no url: ovmf)
(tree with no url: seabios)
Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git
Latest 0ac0a856d986c1ab240753479f5e50fdfab82b14 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f ea210c52abb6458e39f5365f7f2c3abb9c191c47
Basis pass 6b12ebc0ecce75d7bd3660cd85f8b47a615c2071 c530a75c1e6a472b0eb9558310b518f0dfcd8860 e4ceb77cf88bc44f0b7fe39225c49d660735f327 62b3d206425c245ed0a020390a64640d40d97471 c79fc6c4bee28b40948838a760b4aaadf6b5cd47
Generating revisions with ./adhoc-revtuple-generator  git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git#6b12ebc0ecce75d7bd3660cd85f8b47a615c2071-0ac0a856d986c1ab240753479f5e50fdfab82b14 git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860 git://xenbits.xen.org/qemu-xen-traditional.git#e4ceb77cf88bc44f0b7fe39225c49d660735f327-6e20809727261599e8527c456eb078c0e89139a1 git://xenbits.xen.org/qemu-xen.git#62b3d206425c245ed0a020390a64640d40d97471-44a072f0de0d57c95c2212bbce02888832b7b74f git://xenbits.xen.org/xen.git#c79fc6c4bee28b40948838a760b4aaadf6b5cd47-ea210c52abb6458e39f5365f7f2c3abb9c191c47
Loaded 12048 nodes in revision graph
Searching for test results:
 94035 [host=huxelrebe0]
 94083 pass 6b12ebc0ecce75d7bd3660cd85f8b47a615c2071 c530a75c1e6a472b0eb9558310b518f0dfcd8860 e4ceb77cf88bc44f0b7fe39225c49d660735f327 62b3d206425c245ed0a020390a64640d40d97471 c79fc6c4bee28b40948838a760b4aaadf6b5cd47
 94056 [host=huxelrebe1]
 94153 [host=elbling0]
 94728 [host=merlot1]
 95406 [host=fiano1]
 95458 [host=elbling1]
 95521 [host=pinot1]
 95597 [host=fiano0]
 95809 [host=godello1]
 95844 [host=godello0]
 96161 [host=baroque1]
 96188 [host=italia0]
 97278 fail irrelevant
 97289 fail 0ac0a856d986c1ab240753479f5e50fdfab82b14 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f ea210c52abb6458e39f5365f7f2c3abb9c191c47
 97321 pass 6b12ebc0ecce75d7bd3660cd85f8b47a615c2071 c530a75c1e6a472b0eb9558310b518f0dfcd8860 e4ceb77cf88bc44f0b7fe39225c49d660735f327 62b3d206425c245ed0a020390a64640d40d97471 c79fc6c4bee28b40948838a760b4aaadf6b5cd47
 97346 fail 0ac0a856d986c1ab240753479f5e50fdfab82b14 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f ea210c52abb6458e39f5365f7f2c3abb9c191c47
 97357 pass f27ca140ad82b5e76282cc5b54bfb0a665520d17 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97352 pass b5076139991c6b12c62346d9880eec1d4227d99f c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 87beb45e0b05be76755cac53322aae4f5b426aac
 97319 fail 0ac0a856d986c1ab240753479f5e50fdfab82b14 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f ea210c52abb6458e39f5365f7f2c3abb9c191c47
 97389 fail e23042d05035bd64c980ea8f1d9d311972b09104 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97404 pass faa35ed7c7dd74a62bb58340e0ba1819ec33e4e1 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97423 fail a2d8c514753276394d68414f563591f174ef86cb c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97393 pass 1b9dc6680de288cb47e0a3c1587ba69879b3c26f c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97397 pass 4c2b0216cdf54e81f7c0e841b5bb1116701ae25b c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97364 fail 30888a2ea001e237ae9960de877d6f4d2351d8a2 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97376 blocked 71c879eb92223676c4583e130f1b0ce26cddb891 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97413 pass 8f620446135b64ca6f96cf32066a76d64e79a388 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97379 fail 6d94f01566e30c87ebd42e1175ade4f648735578 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97399 fail a2d8c514753276394d68414f563591f174ef86cb c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97417 fail a2d8c514753276394d68414f563591f174ef86cb c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97421 pass 8f620446135b64ca6f96cf32066a76d64e79a388 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97377 fail 0ac0a856d986c1ab240753479f5e50fdfab82b14 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f ea210c52abb6458e39f5365f7f2c3abb9c191c47
 97428 pass 8f620446135b64ca6f96cf32066a76d64e79a388 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
 97435 fail a2d8c514753276394d68414f563591f174ef86cb c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
Searching for interesting versions
 Result found: flight 94083 (pass), for basis pass
 Result found: flight 97289 (fail), for basis failure
 Repro found: flight 97321 (pass), for basis pass
 Repro found: flight 97346 (fail), for basis failure
 0 revisions at 8f620446135b64ca6f96cf32066a76d64e79a388 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 44a072f0de0d57c95c2212bbce02888832b7b74f 22ea8ad02e465e32cd40887c750b55c3a997a288
No revisions left to test, checking graph state.
 Result found: flight 97413 (pass), for last pass
 Result found: flight 97417 (fail), for first failure
 Repro found: flight 97421 (pass), for last pass
 Repro found: flight 97423 (fail), for first failure
 Repro found: flight 97428 (pass), for last pass
 Repro found: flight 97435 (fail), for first failure

dot: graph is too large for cairo-renderer bitmaps. Scaling by 0.719964 to fit
pnmtopng: 47 colors found
Revision graph left in /home/logs/results/bisect/linux-3.18/test-amd64-amd64-xl-credit2.xen-boot.{dot,ps,png,html,svg}.
----------------------------------------
97435: tolerable ALL FAIL

flight 97435 linux-3.18 real-bisect [real]
http://logs.test-lab.xenproject.org/osstest/logs/97435/

Failures :-/ but no regressions.

Tests which did not succeed,
including tests which could not be run:
 test-amd64-amd64-xl-credit2   6 xen-boot                fail baseline untested


jobs:
 test-amd64-amd64-xl-credit2                                  fail    


------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary




