Brent, can you post your entire log file (especially all the "frame :" lines)? Maybe you could bzip2 -9 it and mail it to me, or post it on your HTTP server, from where I can download it.

Thanks,
avati

2008/6/13 Brent A Nelson <brent@xxxxxxxxxxxx>:
> On Fri, 13 Jun 2008, Anand Avati wrote:
>
>>> Indeed it did. However, when doing multiple 10 GB dd writes on my four
>>> client/servers, 2 of my clients died (the other two finished their tasks
>>> without issue and stayed operational). There were a large number of
>>> "frame :" lines in glusterfs.log, and the log ended with:
>>>
>>> frame : type(2) op(0)
>>> frame : type(2) op(0)
>>>
>>> 2008-06-11 20:43:20 C [common-utils.c:155:gf_print_bytes] : xfer ==
>>> 142359133061, rcvd == 130379872
>>> [0xb7fb4420]
>>> /lib/tls/i686/cmov/libc.so.6(abort+0x101)[0xb7e4ea01]
>>> /usr/lib/glusterfs/1.4.0qa19/xlator/mount/fuse.so[0xb758dd84]
>>> /lib/tls/i686/cmov/libpthread.so.0[0xb7f764fb]
>>> /lib/tls/i686/cmov/libc.so.6(clone+0x5e)[0xb7ef8e5e]
>>> ---------
>>
>> Did the client memory usage grow very high before it crashed? Can you tell
>> us the size of the core dump file? Also, is it possible to get a gdb
>> backtrace on the core dump? This will help us a lot.
>
> I didn't think to look for a core dump. The client must have become huge,
> as the core dump was 3 GB. Unfortunately, I didn't compile with debugging;
> just in case, here's the bt anyway:
>
> #0  0xb7fb4410 in __kernel_vsyscall ()
> #1  0xb7e4d085 in raise () from /lib/tls/i686/cmov/libc.so.6
> #2  0xb7e4ea01 in abort () from /lib/tls/i686/cmov/libc.so.6
> #3  0xb758dd84 in ?? () from
>     /usr/lib/glusterfs/1.4.0qa19/xlator/mount/fuse.so
> #4  0x00000001 in ?? ()
> #5  0x00020040 in ?? ()
> #6  0x00021000 in ?? ()
> #7  0x08056de0 in ?? ()
> #8  0x00000000 in ?? ()
>
> If there aren't enough clues, I can compile with debugging and try to
> crash it again.
>
> Thanks,
>
> Brent
> --
> If I traveled to the end of the rainbow
> As Dame Fortune did intend,
> Murphy would be there to tell me
> The pot's at the other end.
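
For anyone following along, the two requests in this thread (compress the log before mailing, and pull a symbolic backtrace from the core) can be sketched roughly as below. The log path and binary path are assumptions for illustration; the real locations depend on how glusterfs was configured.

```shell
# Stand-in for the client log discussed above; the real file lives wherever
# glusterfs was configured to log (e.g. /var/log/glusterfs/glusterfs.log).
printf 'frame : type(2) op(0)\n' > glusterfs.log

# Compress at the maximum level before mailing; -k keeps the original file
# and produces glusterfs.log.bz2 alongside it.
bzip2 -9 -k glusterfs.log
ls -l glusterfs.log.bz2

# For a backtrace with symbols rather than '?? ()', the binary must be
# rebuilt with debug info first, e.g. (hypothetical autoconf invocation):
#   ./configure CFLAGS="-g -O0" && make && make install
# Then, once a crash produces a core file, a full non-interactive backtrace
# can be captured with:
#   gdb /path/to/glusterfs /path/to/core -batch -ex 'bt full' > backtrace.txt
```

Note the binary and the core must match: a backtrace from a core taken against the old, non-debug build will still show `?? ()` frames even if a debug build exists elsewhere.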