Hi all,

We have a two-node replicated Gluster cluster. There are two clients, each mounting one share using the FUSE client. On one of these clients, the gluster FUSE client process was OOM-killed, with this nice log:

Feb 27 14:03:46 client kernel: java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Feb 27 14:03:46 client kernel: java cpuset=c02e902ddefa0c92d05791530938caac1867444e608d9f8531af96f206a20bb0 mems_allowed=0
Feb 27 14:03:46 client kernel: Pid: 32427, comm: java Not tainted 2.6.32-573.18.1.el6.x86_64 #1
Feb 27 14:03:46 client kernel: Call Trace:
Feb 27 14:03:46 client kernel: [<ffffffff810d6d71>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Feb 27 14:03:46 client kernel: [<ffffffff8112a570>] ? dump_header+0x90/0x1b0
Feb 27 14:03:46 client kernel: [<ffffffff8123320c>] ? security_real_capable_noaudit+0x3c/0x70
Feb 27 14:03:46 client kernel: [<ffffffff8112a9f2>] ? oom_kill_process+0x82/0x2a0
Feb 27 14:03:46 client kernel: [<ffffffff8112a931>] ? select_bad_process+0xe1/0x120
Feb 27 14:03:46 client kernel: [<ffffffff8112ae30>] ? out_of_memory+0x220/0x3c0
Feb 27 14:03:46 client kernel: [<ffffffff8113780c>] ? __alloc_pages_nodemask+0x93c/0x950
Feb 27 14:03:46 client kernel: [<ffffffffa008f560>] ? ext4_get_block+0x0/0x120 [ext4]
Feb 27 14:03:46 client kernel: [<ffffffff8117058a>] ? alloc_pages_current+0xaa/0x110
Feb 27 14:03:46 client kernel: [<ffffffff81127967>] ? __page_cache_alloc+0x87/0x90
Feb 27 14:03:46 client kernel: [<ffffffff8112734e>] ? find_get_page+0x1e/0xa0
Feb 27 14:03:46 client kernel: [<ffffffff81128907>] ? filemap_fault+0x1a7/0x500
Feb 27 14:03:46 client kernel: [<ffffffff81151ee4>] ? __do_fault+0x54/0x530
Feb 27 14:03:46 client kernel: [<ffffffff81458ee9>] ? sock_common_recvmsg+0x39/0x50
Feb 27 14:03:46 client kernel: [<ffffffff811524b7>] ? handle_pte_fault+0xf7/0xb20
Feb 27 14:03:46 client kernel: [<ffffffff810a1460>] ? autoremove_wake_function+0x0/0x40
Feb 27 14:03:46 client kernel: [<ffffffff81153179>] ? handle_mm_fault+0x299/0x3d0
Feb 27 14:03:46 client kernel: [<ffffffff8104f156>] ? __do_page_fault+0x146/0x500
Feb 27 14:03:46 client kernel: [<ffffffff814586db>] ? sys_recvfrom+0x16b/0x180
Feb 27 14:03:46 client kernel: [<ffffffff81007ca9>] ? xen_clocksource_get_cycles+0x9/0x10
Feb 27 14:03:46 client kernel: [<ffffffff8153f48e>] ? do_page_fault+0x3e/0xa0
Feb 27 14:03:46 client kernel: [<ffffffff8153c835>] ? page_fault+0x25/0x30
Feb 27 14:03:46 client kernel: Mem-Info:
Feb 27 14:03:46 client kernel: Node 0 DMA per-cpu:
Feb 27 14:03:46 client kernel: CPU 0: hi: 0, btch: 1 usd: 0
Feb 27 14:03:46 client kernel: CPU 1: hi: 0, btch: 1 usd: 0
Feb 27 14:03:46 client kernel: CPU 2: hi: 0, btch: 1 usd: 0
Feb 27 14:03:46 client kernel: CPU 3: hi: 0, btch: 1 usd: 0
Feb 27 14:03:46 client kernel: CPU 4: hi: 0, btch: 1 usd: 0
Feb 27 14:03:46 client kernel: CPU 5: hi: 0, btch: 1 usd: 0
Feb 27 14:03:46 client kernel: CPU 6: hi: 0, btch: 1 usd: 0
Feb 27 14:03:46 client kernel: CPU 7: hi: 0, btch: 1 usd: 0
Feb 27 14:03:46 client kernel: Node 0 DMA32 per-cpu:
Feb 27 14:03:46 client kernel: CPU 0: hi: 186, btch: 31 usd: 0
Feb 27 14:03:46 client kernel: CPU 1: hi: 186, btch: 31 usd: 0
Feb 27 14:03:46 client kernel: CPU 2: hi: 186, btch: 31 usd: 0
Feb 27 14:03:46 client kernel: CPU 3: hi: 186, btch: 31 usd: 0
Feb 27 14:03:46 client kernel: CPU 4: hi: 186, btch: 31 usd: 0
Feb 27 14:03:46 client kernel: CPU 5: hi: 186, btch: 31 usd: 0
Feb 27 14:03:46 client kernel: CPU 6: hi: 186, btch: 31 usd: 0
Feb 27 14:03:46 client kernel: CPU 7: hi: 186, btch: 31 usd: 0
Feb 27 14:03:46 client kernel: Node 0 Normal per-cpu:
Feb 27 14:03:46 client kernel: CPU 0: hi: 186, btch: 31 usd: 24
Feb 27 14:03:46 client kernel: CPU 1: hi: 186, btch: 31 usd: 0
Feb 27 14:03:46 client kernel: CPU 2: hi: 186, btch: 31 usd: 21
Feb 27 14:03:46 client kernel: CPU 3: hi: 186, btch: 31 usd: 0
Feb 27 14:03:46 client kernel: CPU 4: hi: 186, btch: 31 usd: 48
Feb 27 14:03:46 client kernel: CPU 5: hi: 186, btch: 31 usd: 164
Feb 27 14:03:46 client kernel: CPU 6: hi: 186, btch: 31 usd: 18
Feb 27 14:03:46 client kernel: CPU 7: hi: 186, btch: 31 usd: 10
Feb 27 14:03:46 client kernel: active_anon:8061669 inactive_anon:3 isolated_anon:0
Feb 27 14:03:46 client kernel: active_file:125 inactive_file:7110 isolated_file:0
Feb 27 14:03:46 client kernel: unevictable:0 dirty:17 writeback:16 unstable:0
Feb 27 14:03:46 client kernel: free:49585 slab_reclaimable:3251 slab_unreclaimable:23727
Feb 27 14:03:46 client kernel: mapped:593 shmem:38 pagetables:17542 bounce:0
Feb 27 14:03:46 client kernel: Node 0 DMA free:15628kB min:28kB low:32kB high:40kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15232kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Feb 27 14:03:46 client kernel: lowmem_reserve[]: 0 3768 32300 32300
Feb 27 14:03:46 client kernel: Node 0 DMA32 free:121860kB min:7880kB low:9848kB high:11820kB active_anon:3101704kB inactive_anon:0kB active_file:184kB inactive_file:388kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3858656kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:236kB slab_unreclaimable:9036kB kernel_stack:496kB pagetables:5164kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Feb 27 14:03:46 client kernel: lowmem_reserve[]: 0 0 28532 28532
Feb 27 14:03:47 client kernel: Node 0 Normal free:60852kB min:59672kB low:74588kB high:89508kB active_anon:29144972kB inactive_anon:12kB active_file:316kB inactive_file:28752kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:29217280kB mlocked:0kB dirty:68kB writeback:64kB mapped:2512kB shmem:152kB slab_reclaimable:12768kB slab_unreclaimable:85872kB kernel_stack:7280kB pagetables:65004kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:32 all_unreclaimable? no
Feb 27 14:03:47 client kernel: lowmem_reserve[]: 0 0 0 0
Feb 27 14:03:47 client kernel: Node 0 DMA: 1*4kB 1*8kB 2*16kB 1*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15628kB
Feb 27 14:03:47 client kernel: Node 0 DMA32: 380*4kB 368*8kB 104*16kB 226*32kB 257*64kB 196*128kB 95*256kB 54*512kB 15*1024kB 0*2048kB 0*4096kB = 122224kB
Feb 27 14:03:47 client kernel: Node 0 Normal: 15933*4kB 74*8kB 7*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 64436kB
Feb 27 14:03:47 client kernel: 5988 total pagecache pages
Feb 27 14:03:47 client kernel: 0 pages in swap cache
Feb 27 14:03:47 client kernel: Swap cache stats: add 0, delete 0, find 0/0
Feb 27 14:03:47 client kernel: Free swap = 0kB
Feb 27 14:03:47 client kernel: Total swap = 0kB
Feb 27 14:03:47 client kernel: 8388607 pages RAM
Feb 27 14:03:47 client kernel: 169218 pages reserved
Feb 27 14:03:47 client kernel: 1204 pages shared
Feb 27 14:03:47 client kernel: 8158288 pages non-shared
Feb 27 14:03:47 client kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Feb 27 14:03:47 client kernel: [ 522] 0 522 2812 253 0 -17 -1000 udevd
Feb 27 14:03:47 client kernel: [ 1157] 0 1157 2280 124 3 0 0 dhclient
Feb 27 14:03:47 client kernel: [ 1204] 0 1204 23283 67 1 -17 -1000 auditd
Feb 27 14:03:47 client kernel: [ 1238] 0 1238 62338 616 3 0 0 rsyslogd
Feb 27 14:03:47 client kernel: [ 1261] 0 1261 4596 76 3 0 0 irqbalance
Feb 27 14:03:47 client kernel: [ 1279] 32 1279 4744 65 3 0 0 rpcbind
Feb 27 14:03:47 client kernel: [ 1301] 29 1301 5837 113 0 0 0 rpc.statd
Feb 27 14:03:47 client kernel: [ 1335] 81 1335 24335 89 4 0 0 dbus-daemon
Feb 27 14:03:47 client kernel: [ 1357] 0 1357 47235 221 0 0 0 cupsd
Feb 27 14:03:47 client kernel: [ 1447] 0 1447 4105054 3969537 0 0 0 glusterfs
Feb 27 14:03:47 client kernel: [ 1471] 0 1471 1020 28 4 0 0 acpid
Feb 27 14:03:47 client kernel: [ 1483] 68 1483 9458 171 3 0 0 hald
Feb 27 14:03:47 client kernel: [ 1484] 0 1484 5100 51 1 0 0 hald-runner
Feb 27 14:03:47 client kernel: [ 1513] 0 1513 5630 46 4 0 0 hald-addon-inpu
Feb 27 14:03:47 client kernel: [ 1521] 68 1521 4502 42 5 0 0 hald-addon-acpi
Feb 27 14:03:47 client kernel: [ 1545] 0 1545 96535 637 0 0 0 automount
Feb 27 14:03:47 client kernel: [ 1669] 0 1669 16556 177 0 -17 -1000 sshd
Feb 27 14:03:47 client kernel: [ 1748] 0 1748 20217 225 3 0 0 master
Feb 27 14:03:47 client kernel: [ 1759] 89 1759 20280 220 3 0 0 qmgr
Feb 27 14:03:47 client kernel: [ 1777] 0 1777 45233 245 3 0 0 abrtd
Feb 27 14:03:47 client kernel: [ 1789] 0 1789 29216 156 3 0 0 crond
Feb 27 14:03:47 client kernel: [ 1805] 0 1805 5276 46 3 0 0 atd
Feb 27 14:03:47 client kernel: [ 1837] 0 1837 319752 4831 1 0 0 docker
Feb 27 14:03:47 client kernel: [ 2107] 0 2107 27085 39 3 0 0 rhsmcertd
Feb 27 14:03:47 client kernel: [ 2201] 0 2201 16081 171 3 0 0 certmonger
Feb 27 14:03:47 client kernel: [ 2257] 0 2257 1016 21 7 0 0 mingetty
Feb 27 14:03:47 client kernel: [ 2259] 0 2259 1016 22 3 0 0 mingetty
Feb 27 14:03:47 client kernel: [ 2261] 0 2261 1016 22 3 0 0 mingetty
Feb 27 14:03:47 client kernel: [ 2263] 0 2263 1016 21 3 0 0 mingetty
Feb 27 14:03:47 client kernel: [ 2265] 0 2265 1016 21 3 0 0 mingetty
Feb 27 14:03:47 client kernel: [ 2267] 0 2267 1016 20 3 0 0 mingetty
Feb 27 14:03:47 client kernel: [ 4985] 0 4985 26827 37 6 0 0 rpc.rquotad
Feb 27 14:03:47 client kernel: [ 4990] 0 4990 5443 159 3 0 0 rpc.mountd
Feb 27 14:03:47 client kernel: [ 5037] 0 5037 5774 60 0 0 0 rpc.idmapd
Feb 27 14:03:47 client kernel: [ 5619] 0 5619 29660 144 3 0 0 screen
Feb 27 14:03:47 client kernel: [ 5620] 0 5620 27085 101 0 0 0 bash
Feb 27 14:03:47 client kernel: [16661] 500 16661 29659 157 1 0 0 screen
Feb 27 14:03:47 client kernel: [16662] 500 16662 27118 116 0 0 0 bash
Feb 27 14:03:47 client kernel: [ 6355] 0 6355 2826 251 0 -17 -1000 udevd
Feb 27 14:03:47 client kernel: [27671] 0 27671 35786 411 7 0 0 docker
Feb 27 14:03:47 client kernel: [27683] 0 27683 2811 252 0 -17 -1000 udevd
Feb 27 14:03:47 client kernel: [27688] 0 27688 33225 347 7 0 0 docker
Feb 27 14:03:47 client kernel: [27695] 0 27695 33497 366 3 0 0 docker
Feb 27 14:03:47 client kernel: [27744] 0 27744 11766 104 6 0 0 sudo
Feb 27 14:03:47 client kernel: [27748] 1000 27748 4489 76 6 0 0 init.sh
Feb 27 14:03:47 client kernel: [27818] 1000 27818 2325695 1157905 7 0 0 java
Feb 27 14:03:47 client kernel: [27877] 1000 27877 1082 39 6 0 0 tail
Feb 27 14:03:47 client kernel: [32337] 0 32337 11766 103 0 0 0 sudo
Feb 27 14:03:47 client kernel: [32342] 1000 32342 4485 72 0 0 0 init.sh
Feb 27 14:03:47 client kernel: [32381] 1000 32381 3824856 2912518 4 0 0 java
Feb 27 14:03:47 client kernel: [32411] 1000 32411 1082 38 2 0 0 tail
Feb 27 14:03:47 client kernel: [ 571] 1000 571 211038 5419 6 0 0 soffice.bin
Feb 27 14:03:47 client kernel: [ 626] 1000 626 210839 5347 7 0 0 soffice.bin
Feb 27 14:03:47 client kernel: [ 2124] 89 2124 20237 220 3 0 0 pickup
Feb 27 14:03:47 client kernel: Out of memory: Kill process 1447 (glusterfs) score 453 or sacrifice child
Feb 27 14:03:47 client kernel: Killed process 1447, UID 0, (glusterfs) total-vm:16420216kB, anon-rss:15877020kB, file-rss:1192kB

Any ideas what caused this, or how we can prevent it from happening?

Thanks,
Tom
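For reference, the total_vm and rss columns in the process table are in 4 KiB pages (x86_64), so converting the glusterfs row reproduces the figures in the "Killed process" line; a quick sanity check:

```python
# The OOM report's total_vm/rss columns count 4 KiB pages on x86_64;
# multiplying the glusterfs row by the page size in kB reproduces
# the kB figures the kernel prints in the kill line.
PAGE_KB = 4  # page size in kB on x86_64

glusterfs_total_vm_pages = 4105054  # from the [ 1447] row above
glusterfs_rss_pages = 3969537

print(glusterfs_total_vm_pages * PAGE_KB)  # 16420216 kB, matches "total-vm:16420216kB"
print(glusterfs_rss_pages * PAGE_KB)       # 15878148 kB, i.e. ~15 GiB resident
```

So the FUSE client alone was holding roughly 15 GiB of anonymous memory on a ~32 GiB box with no swap configured ("Total swap = 0kB" above), which is why it got the highest badness score.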
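One stopgap we could try, if nothing else, is shielding the mount process itself from the OOM killer by lowering its /proc/<pid>/oom_score_adj (-1000 disables killing entirely, the same exemption udevd, auditd, and sshd carry in the table above). This is only a sketch and only redirects the kill elsewhere; it does nothing about the client's memory growth, and a remounted client starts over at the default. Lowering the value needs root, so the demo below raises it on the current shell instead; for the mount it would be pid=$(pidof glusterfs), assuming a single glusterfs process on the box:

```shell
# Sketch: adjust OOM-killer preference via /proc/<pid>/oom_score_adj.
# Demonstrated on the current shell; raising the value needs no
# privilege, while lowering it (e.g. to -1000 for the glusterfs
# client, as root) requires CAP_SYS_RESOURCE.
pid=$$
echo 300 > "/proc/$pid/oom_score_adj"
cat "/proc/$pid/oom_score_adj"
```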
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users