Those are 4.x GB. Can you post dmesg output as well? Also, what's
'ulimit -l' on your system?
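For reference, the limit and how much one brick process currently has
pinned can be checked along these lines (output is illustrative, not from
these machines, and with 8 bricks per server 'pidof -s' only samples one
glusterfsd):

# ulimit -l
64
# grep VmLck /proc/$(pidof -s glusterfsd)/status
VmLck:         0 kB

A value of 64 (KB, the usual distro default) is far too small for RDMA
memory registration, which is why OFED setups usually raise it. Any
page-allocation or HCA driver errors around the crash time should also
show up in dmesg.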
On Fri, Jun 8, 2012 at 4:41 PM, Ling Ho <ling at slac.stanford.edu> wrote:
>
> This is the core file from the crash just now:
>
> [root@psanaoss213 /]# ls -al core*
> -rw------- 1 root root 4073594880 Jun  8 15:05 core.22682
>
> From yesterday:
>
> [root@psanaoss214 /]# ls -al core*
> -rw------- 1 root root 4362727424 Jun  8 00:58 core.13483
> -rw------- 1 root root 4624773120 Jun  8 03:21 core.8792
>
> On 06/08/2012 04:34 PM, Anand Avati wrote:
>
> Is it possible the system was running low on memory? I see you have 48GB,
> but a memory registration failure typically occurs because the system
> limit on the number of pinnable pages in RAM was hit. Can you tell us the
> size of your core dump files after the crash?
>
> Avati
>
> On Fri, Jun 8, 2012 at 4:22 PM, Ling Ho <ling at slac.stanford.edu> wrote:
>
>> Hello,
>>
>> I have a brick that crashed twice today, and another, different brick
>> that crashed just a while ago.
>>
>> This is what I see in one of the brick logs:
>>
>> patchset: git://git.gluster.com/glusterfs.git
>> patchset: git://git.gluster.com/glusterfs.git
>> signal received: 6
>> signal received: 6
>> time of crash: 2012-06-08 15:05:11
>> configuration details:
>> argp 1
>> backtrace 1
>> dlfcn 1
>> fdatasync 1
>> libpthread 1
>> llistxattr 1
>> setfsid 1
>> spinlock 1
>> epoll.h 1
>> xattr.h 1
>> st_atim.tv_nsec 1
>> package-string: glusterfs 3.2.6
>> /lib64/libc.so.6[0x34bc032900]
>> /lib64/libc.so.6(gsignal+0x35)[0x34bc032885]
>> /lib64/libc.so.6(abort+0x175)[0x34bc034065]
>> /lib64/libc.so.6[0x34bc06f977]
>> /lib64/libc.so.6[0x34bc075296]
>> /opt/glusterfs/3.2.6/lib64/libglusterfs.so.0(__gf_free+0x44)[0x7f1740ba25e4]
>> /opt/glusterfs/3.2.6/lib64/libgfrpc.so.0(rpc_transport_destroy+0x47)[0x7f1740956967]
>> /opt/glusterfs/3.2.6/lib64/libgfrpc.so.0(rpc_transport_unref+0x62)[0x7f1740956a32]
>> /opt/glusterfs/3.2.6/lib64/glusterfs/3.2.6/rpc-transport/rdma.so(+0xc135)[0x7f173ca27135]
>> /lib64/libpthread.so.0[0x34bc8077f1]
>> /lib64/libc.so.6(clone+0x6d)[0x34bc0e5ccd]
>> ---------
>>
>> And somewhere before these, there is also:
>> [2012-06-08 15:05:07.512604] E [rdma.c:198:rdma_new_post]
>> 0-rpc-transport/rdma: memory registration failed
>>
>> I have 48GB of memory on the system:
>>
>> # free
>>              total       used       free     shared    buffers     cached
>> Mem:      49416716   34496648   14920068          0      31692   28209612
>> -/+ buffers/cache:    6255344   43161372
>> Swap:      4194296       1740    4192556
>>
>> # uname -a
>> Linux psanaoss213 2.6.32-220.7.1.el6.x86_64 #1 SMP Fri Feb 10 15:22:22
>> EST 2012 x86_64 x86_64 x86_64 GNU/Linux
>>
>> The server gluster version is 3.2.6-1. I have both rdma clients and tcp
>> clients over a 10Gb/s network.
>>
>> Any suggestion what I should look for?
>>
>> Is there a way to just restart the brick, and not glusterd on the
>> server? I have 8 bricks on the server.
>>
>> Thanks,
>> ...
>> ling
>>
>> Here's the volume info:
>>
>> # gluster volume info
>>
>> Volume Name: ana12
>> Type: Distribute
>> Status: Started
>> Number of Bricks: 40
>> Transport-type: tcp,rdma
>> Bricks:
>> Brick1: psanaoss214:/brick1
>> Brick2: psanaoss214:/brick2
>> Brick3: psanaoss214:/brick3
>> Brick4: psanaoss214:/brick4
>> Brick5: psanaoss214:/brick5
>> Brick6: psanaoss214:/brick6
>> Brick7: psanaoss214:/brick7
>> Brick8: psanaoss214:/brick8
>> Brick9: psanaoss211:/brick1
>> Brick10: psanaoss211:/brick2
>> Brick11: psanaoss211:/brick3
>> Brick12: psanaoss211:/brick4
>> Brick13: psanaoss211:/brick5
>> Brick14: psanaoss211:/brick6
>> Brick15: psanaoss211:/brick7
>> Brick16: psanaoss211:/brick8
>> Brick17: psanaoss212:/brick1
>> Brick18: psanaoss212:/brick2
>> Brick19: psanaoss212:/brick3
>> Brick20: psanaoss212:/brick4
>> Brick21: psanaoss212:/brick5
>> Brick22: psanaoss212:/brick6
>> Brick23: psanaoss212:/brick7
>> Brick24: psanaoss212:/brick8
>> Brick25: psanaoss213:/brick1
>> Brick26: psanaoss213:/brick2
>> Brick27: psanaoss213:/brick3
>> Brick28: psanaoss213:/brick4
>> Brick29: psanaoss213:/brick5
>> Brick30: psanaoss213:/brick6
>> Brick31: psanaoss213:/brick7
>> Brick32: psanaoss213:/brick8
>> Brick33: psanaoss215:/brick1
>> Brick34: psanaoss215:/brick2
>> Brick35: psanaoss215:/brick4
>> Brick36: psanaoss215:/brick5
>> Brick37: psanaoss215:/brick7
>> Brick38: psanaoss215:/brick8
>> Brick39: psanaoss215:/brick3
>> Brick40: psanaoss215:/brick6
>> Options Reconfigured:
>> performance.io-thread-count: 16
>> performance.write-behind-window-size: 16MB
>> performance.cache-size: 1GB
>> nfs.disable: on
>> performance.cache-refresh-timeout: 1
>> network.ping-timeout: 42
>> performance.cache-max-file-size: 1PB
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
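If the memlock limit does turn out to be the culprit, the usual remedy is
to raise it for the brick processes. A minimal sketch, assuming PAM
applies /etc/security/limits.conf to the session glusterd is started from
(daemons launched from an init script at boot bypass PAM, so there you
would add 'ulimit -l unlimited' to the init script before the daemon
starts):

# /etc/security/limits.conf
root soft memlock unlimited
root hard memlock unlimited

Confirm with 'ulimit -l' in a fresh root login before restarting the
daemons.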
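On restarting a single brick without taking down the rest: each brick is
served by its own glusterfsd process, and glusterd is only the management
daemon, so restarting glusterd does not stop bricks that are still
running; on startup it respawns any that have died. A rough sketch
(whether 3.2.6 already supports the 'force' form is worth confirming with
'gluster volume help'):

# ps aux | grep glusterfsd | grep brick3      (one glusterfsd per brick)
# gluster volume start ana12 force            (newer releases: respawns only missing bricks)
# service glusterd restart                    (fallback; leaves running bricks untouched)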