Read tests passed, but the backup crashed both the brick and the client.

Here is the backtrace from the brick that crashed:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1269179504 (LWP 30452)]
inode_forget (inode=0x8064038, nlookup=0) at list.h:92
92          prev->next = next;
(gdb) bt
#0  inode_forget (inode=0x8064038, nlookup=0) at list.h:92
#1  0xb75c0d0a in posix_forget () from /usr/lib/glusterfs/1.3.0-pre5/xlator/storage/posix.so
#2  0xb75b5676 in iot_forget_wrapper () from /usr/lib/glusterfs/1.3.0-pre5/xlator/performance/io-threads.so
#3  0xb7f44f4a in call_resume_wind (stub=0x8064038) at call-stub.c:2027
#4  0xb7f44fd7 in call_resume (stub=0x810bfd8) at call-stub.c:2763
#5  0xb75b97a5 in iot_worker () from /usr/lib/glusterfs/1.3.0-pre5/xlator/performance/io-threads.so
#6  0xb7f153db in start_thread () from /lib/libpthread.so.0
#7  0xb7e9f26e in clone () from /lib/libc.so.6

Harris

----- Original Message -----
From: "Basavanagowda Kanur" <gowda@xxxxxxxxxxxxx>
To: "Harris Landgarten" <harrisl@xxxxxxxxxxxxx>
Cc: "Anand Avati" <avati@xxxxxxxxxxxxx>, "gluster-devel" <gluster-devel@xxxxxxxxxx>
Sent: Friday, June 29, 2007 9:36:17 AM (GMT-0500) America/New_York
Subject: Re: brick crash/hang with io-threads in 2.5 patch 240

Harris,

Please find the fix for the bug in patch-243.

Thanks,
gowda

On 6/28/07, Harris Landgarten <harrisl@xxxxxxxxxxxxx> wrote:

Avati,

I managed to get a bt from the server by attaching to the process with gdb:

0xb7f60f38 in dict_set (this=0x8056fc8, key=0xb75d8fa3 "key", value=0x8056c90) at dict.c:124
124         for (pair = this->members[hashval]; pair != NULL; pair = pair->hash_next) {
(gdb) bt
#0  0xb7f60f38 in dict_set (this=0x8056fc8, key=0xb75d8fa3 "key", value=0x8056c90) at dict.c:124
#1  0xb75cf36b in server_getxattr_cbk () from /usr/lib/glusterfs/1.3.0-pre5/xlator/protocol/server.so
#2  0xb7f64d55 in default_getxattr_cbk (frame=0x8057228, cookie=0x8057740, this=0x804ffc0, op_ret=0, op_errno=13, dict=0x8056fc8) at defaults.c:1071
#3  0xb7f6d462 in call_resume (stub=0x8056858) at call-stub.c:2469
#4  0xb75e1770 in iot_reply () from /usr/lib/glusterfs/1.3.0-pre5/xlator/performance/io-threads.so
#5  0xb7f3d3db in start_thread () from /lib/libpthread.so.0
#6  0xb7ec726e in clone () from /lib/libc.so.6

I hope this helps. Have you been able to reproduce?

Harris

----- Original Message -----
From: "Anand Avati" <avati@xxxxxxxxxxxxx>
To: "Harris Landgarten" <harrisl@xxxxxxxxxxxxx>
Cc: "gluster-devel" <gluster-devel@xxxxxxxxxx>
Sent: Wednesday, June 27, 2007 8:09:13 AM (GMT-0500) America/New_York
Subject: Re: brick crash/hang with io-threads in 2.5 patch 240

Is there a backtrace of the server available too? It would be of great help.

thanks,
avati

2007/6/27, Harris Landgarten <harrisl@xxxxxxxxxxxxx>:

Whenever I enable io-threads in one of my bricks I can cause a crash.

In client1:

ls -lR /mnt/glusterfs

While this is running, in client2:

ls -l /mnt/glusterfs
ls: /mnt/glusterfs/secondary: Transport endpoint is not connected
total 4
?--------- ? ? ? ? ? /mnt/glusterfs/backups
?--------- ? ? ? ? ? /mnt/glusterfs/tmp

At this point the brick with io-threads has crashed:

2007-06-27 07:45:55 C [common-utils.c:205:gf_print_trace] debug-backtrace: Got signal (11), printing backtrace
2007-06-27 07:45:55 C [common-utils.c:207:gf_print_trace] debug-backtrace: /usr/lib/libglusterfs.so.0(gf_print_trace+0x2d) [0xb7fabd4d]
2007-06-27 07:45:55 C [common-utils.c:207:gf_print_trace] debug-backtrace: [0xbfffe420]
2007-06-27 07:45:55 C [common-utils.c:207:gf_print_trace] debug-backtrace: /usr/lib/glusterfs/1.3.0-pre5/xlator/protocol/server.so [0xb761436b]
2007-06-27 07:45:55 C [common-utils.c:207:gf_print_trace] debug-backtrace: /usr/lib/libglusterfs.so.0 [0xb7fa9d55]
2007-06-27 07:45:55 C [common-utils.c:207:gf_print_trace] debug-backtrace: /usr/lib/libglusterfs.so.0(call_resume+0x4f2) [0xb7fb2462]
2007-06-27 07:45:55 C [common-utils.c:207:gf_print_trace] debug-backtrace: /usr/lib/glusterfs/1.3.0-pre5/xlator/performance/io-threads.so [0xb7626770]
2007-06-27 07:45:55 C [common-utils.c:207:gf_print_trace] debug-backtrace: /lib/libpthread.so.0 [0xb7f823db]
2007-06-27 07:45:55 C [common-utils.c:207:gf_print_trace] debug-backtrace: /lib/libc.so.6(clone+0x5e) [0xb7f0c26

The brick is running on Fedora and it doesn't want to generate a core. Any suggestions?

This is the spec file I used for the test:

### Export volume "brick" with the contents of "/home/export" directory.
volume posix1
  type storage/posix                   # POSIX FS translator
  option directory /mnt/export         # Export this directory
end-volume

volume io-threads
  type performance/io-threads
  option thread-count 8
  subvolumes posix1
end-volume

### Add POSIX record locking support to the storage brick
volume brick
  type features/posix-locks
  option mandatory on                  # enables mandatory locking on all files
  subvolumes io-threads
end-volume

### Add network serving capability to above brick.
volume server
  type protocol/server
  option transport-type tcp/server     # For TCP/IP transport
# option transport-type ib-sdp/server  # For Infiniband transport
# option bind-address 192.168.1.10     # Default is to listen on all interfaces
  option listen-port 6996              # Default is 6996
# option client-volume-filename /etc/glusterfs/glusterfs-client.vol
  subvolumes brick
# NOTE: Access to any volume through protocol/server is denied by
# default. You need to explicitly grant access through "auth" option.
  option auth.ip.brick.allow *         # access to "brick" volume
end-volume

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel

--
Anand V. Avati