Constantin,

The random 'disconnection' is a fixed issue, and the fix is available in the checkout with 'tla get glusterfs--mainline--2.4'.

With regard to running databases over glusterfs, we have not performed extensive analysis of this scenario. As a crude analysis, I would presume databases open their files with O_DIRECT, which disables all the performance enhancements. I am sure glusterfs is pretty much untuned for running databases over it at this stage, though it should 'work'. With some feedback from the users who run databases on glusterfs, it would be possible to tune it further to give the best performance.

GlusterFS aims at becoming 'best suited' for all types of applications, since that is easy with the translator design (you could load an alternative translator whose code is tuned to deliver the best performance for that application), and databases are definitely one of the applications we are looking at in the long run.
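For illustration, a client-side stack tuned for a database workload might drop the caching translators (which O_DIRECT bypasses anyway) and stack io-threads over the cluster translator to keep concurrent requests moving. A rough sketch only, reusing the client volumes from your config and assuming the performance/io-threads translator is present in your build (option names may differ between pre-releases):

-------------------
### sketch of a database-oriented client stack, not a tested recommendation;
### io-threads and its thread-count option are assumed to exist in this release
volume afr
  type cluster/afr
  subvolumes client1 client2 client3
  option replicate *:3          # All files 3 copies
end-volume

volume iothreads
  type performance/io-threads
  option thread-count 4         # assumed option name -- check your release
  subvolumes afr
end-volume
-------------------

thanks!
avati

2007/6/24, Constantin Teodorescu <teo@xxxxxxx>: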
Hi all,

First of all, glusterfs is a nice idea and I liked it a lot, but it needs to be a rock-solid product, so I ran some tests; I hope this bug report will help you.

Configuration: 3 identical servers and a client, connected over TCP/IP.
Servers: Mandrake 9.x (glusterfs compiled and installed without problems).
Client: CentOS-5 x86_64, fuse 2.6.5 compiled and installed from source; everything went OK.

Those 3 servers provide 3 simple bricks, joined at the client in a full mirror (afr x 3) configuration plus the read-ahead and write-behind translators.

I made a PostgreSQL tablespace (zone) on the mounted /mnt/gfs, copied a 50 MB table there, and then stressed it with various operations. Ten reads and full updates on every row in the table succeeded. After a while, a simple "vacuum full analyze" gave this error:

glu=# vacuum full analyze;
ERROR: could not read block 43155 of relation 527933664/527933665/527933666: Transport endpoint is not connected

I repeated the tests many times; after 2-3 minutes of operation I got the same error, in other places, but mostly on WRITE operations. I deleted the whole mounted client disk and rebuilt it with the STRIPE translator instead of AFR. The behaviour is the same: after a couple of successful operations, I got a failure.

Those were the facts; now the logs and configuration files. The client debug log shows these errors:

[1:48:22] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
[1:48:22] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write to break on blocked socket (if any)
[1:48:22] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
[1:48:22] [ERROR/common-utils.c:110/full_rwv()] libglusterfs:full_rwv: 6689 bytes r/w instead of 8539 (Broken pipe)
[1:48:22] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
[1:48:22] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write to break on blocked socket (if any)
[1:48:22] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
...
...
[1:48:22] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write to break on blocked socket (if any)
[1:48:22] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
[1:48:22] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write to break on blocked socket (if any)
[1:48:22] [DEBUG/client-protocol.c:2708/client_protocol_interpret()] protocol/client:frame not found for blk with callid: 139893
[1:48:22] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x16595730
[1:48:22] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:client1: connection to server disconnected
[1:48:22] [CRITICAL/common-utils.c:215/gf_print_trace()] debug-backtrace:Got signal (11), printing backtrace
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(gf_print_trace+0x21) [0x2aaaaaccf4a1]
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/lib64/libc.so.6 [0x2aaaab53b070]
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/read-ahead.so(ra_frame_return+0x142) [0x2aaaac4da2a2]
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/read-ahead.so [0x2aaaac4d9daa]
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:[glusterfs] [0x40910b]
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib64/libfuse.so.2 [0x2aaaaaee3059]
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:[glusterfs] [0x402f29]
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(sys_epoll_iteration+0xd4) [0x2aaaaacd0ef4]
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:[glusterfs] [0x402898]
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/lib64/libc.so.6(__libc_start_main+0xf4) [0x2aaaab5288a4]
[1:48:22] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:[glusterfs] [0x4025f9]

The server configuration files are like this:

-------------------
volume brick
  type storage/posix
  option directory /var/gldata
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option listen-port 6996
  option bind-address 29.11.276.x
  subvolumes brick
  option auth.ip.brick.allow *
end-volume
-------------------

The client configuration file is:

-------------------
volume clientX                  #{1,2,3}
  type protocol/client
  option transport-type tcp/client
  option remote-host A.B.C.X
  option remote-port 6996
  option remote-subvolume brick
end-volume

### Add AFR feature to brick
volume afr
  type cluster/afr
  subvolumes client1 client2 client3
  option replicate *:3          # All files 3 copies
end-volume

#volume stripe
#  type cluster/stripe
#  subvolumes client1 client2 client3
#  option block-size *:256kB
#end-volume

#volume trace
#  type debug/trace
#  subvolumes afr
#  option debug on
#end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 131072  # aggregate block size in bytes
  subvolumes afr
end-volume

volume readahead
  type performance/read-ahead
  option page-size 131072       ### size in bytes
  option page-count 16          ### page-size x page-count is the amount of read-ahead data per file
  subvolumes writebehind
end-volume
-------------------
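One debugging suggestion: the trace volume that is already present (commented out) in the config above could be wired between afr and write-behind, so the client log records each call leading up to the bail. A minimal sketch, reusing the volume names from the config; only the 'subvolumes' line of writebehind changes:

-------------------
volume trace
  type debug/trace
  subvolumes afr
  option debug on
end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 131072  # aggregate block size in bytes
  subvolumes trace              # write-behind now stacks on trace instead of afr
end-volume
-------------------

With that in place, the log should show which operation is in flight when call_bail() fires and the read-ahead crash follows.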
-- Anand V. Avati