URL: <http://savannah.nongnu.org/bugs/?20200>
Summary: Segfault on file not found
Project: Gluster
Submitted by: hook
Submitted on: Monday 06/18/2007 at 07:08
Category: GlusterFS
Severity: 3 - Normal
Priority: 5 - Normal
Item Group: Crash
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Operating System: GNU/Linux

_______________________________________________________

Details:

When testing GlusterFS (glusterfs-1.3.0-pre4) with some traffic from our production servers, the system crashed (and started returning 'Transport endpoint is not connected'). This happens regularly, but I cannot track down why. The core file indicates the glusterfs client is segfaulting because a file cannot be found.

-- 8<-- output from gdb glusterfs /core.10469:

Program terminated with signal 11, Segmentation fault.
#0  0xb760531f in ra_frame_return (frame=0x81886e0) at page.c:284
284     page.c: No such file or directory.
        in page.c
(gdb) bt
#0  0xb760531f in ra_frame_return (frame=0x81886e0) at page.c:284
#1  0xb7604586 in ra_readv (frame=0x81886e0, this=0x8076368, file_ctx=0x8181520, size=8192, offset=0) at read-ahead.c:412
#2  0xb7fa8964 in default_readv (frame=0x8174c68, this=0x80768c0, fd=0x8181520, size=8192, offset=0) at defaults.c:582
#3  0x0804c089 in fuse_readv (req=0x818a0c8, ino=1187, size=8192, off=0, fi=0xbfb93b5c) at fuse-internals.c:1910
#4  0xb7f947e9 in fuse_reply_err () from /usr/lib/libfuse.so.2
#5  0xb7f95733 in fuse_reply_entry () from /usr/lib/libfuse.so.2
#6  0xb7f96f26 in fuse_session_process () from /usr/lib/libfuse.so.2
#7  0x0804a8c8 in fuse_transport_notify (xl=0x80541f0, trans=0x8054398, event=<value optimized out>) at fuse-bridge.c:312
#8  0xb7faa9bd in transport_notify (this=0x8054398, event=1) at transport.c:148
#9  0xb7fab569 in sys_epoll_iteration (ctx=0xbfb93cdc) at epoll.c:53
#10 0xb7faaa6d in poll_iteration (ctx=0xbfb93cdc) at transport.c:251
#11 0x0804a11b in main (argc=4, argv=0xbfb93db4) at glusterfs.c:326
-- 8<--

All nodes started
with an empty directory, and most data was rsynced from an existing email system into the mounted gluster directory. The system uses three bricks, with the following AFR setup:

server1-brick => server1-mirror (on server2)
server2-brick => server2-mirror (on server3)
server3-brick => server3-mirror (on server1)

It doesn't matter which scheduler is used on the unify brick (although the backtrace above was created using the nufa scheduler).

The main issue here is that the system went dead in this situation. As GlusterFS will be used in cluster environments where HA is required, this is not good.

_______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?20200>

_______________________________________________
Message sent via/by Savannah
http://savannah.nongnu.org/
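For reference, the topology described above (each server's brick mirrored by AFR onto the next server, with unify on top) would look roughly like the following client-side volume spec. This is only a sketch for one brick/mirror pair: the hostnames, volume names, transport options, and the nufa option shown are assumptions, not the reporter's actual configuration.

```
# Hypothetical glusterfs-client.vol sketch (GlusterFS 1.3-era syntax)

volume server1-brick            # primary copy, exported from server1
  type protocol/client
  option transport-type tcp/client
  option remote-host server1
  option remote-subvolume brick
end-volume

volume server1-mirror           # mirror of server1's data, hosted on server2
  type protocol/client
  option transport-type tcp/client
  option remote-host server2
  option remote-subvolume mirror
end-volume

volume afr1                     # replicate brick and mirror
  type cluster/afr
  subvolumes server1-brick server1-mirror
end-volume

# afr2 and afr3 would be defined the same way for the
# server2/server3 and server3/server1 pairs.

volume unify0                   # aggregate the three replicated pairs
  type cluster/unify
  option scheduler nufa         # reporter notes any scheduler reproduces it
  subvolumes afr1 afr2 afr3
end-volume
```

The point of the sketch is just to make the failure domain visible: a read through unify0 passes down through an AFR pair into the read-ahead/protocol translators, which matches the ra_readv/ra_frame_return frames in the backtrace.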