Hi,

After a month of file operations, which included copying 20 million small files and running about 20 thousand cluster jobs, I am overall satisfied except for two stability glitches.

1. A small portion (about 1%?) of jobs got a "transport endpoint not connected" error, and their output files are incomplete. This error happens on random computing nodes, and it does not affect subsequent jobs on the same node.

An example of the error message from glusterfsd is:

2008-11-19 23:09:51 E [protocol.c:271:gf_block_unserialize_transport] server: EOF from peer (172.20.102.2:1022)

The error from glusterfs is either (which looks to be caused by a brick):

2008-11-19 23:09:52 C [client-protocol.c:212:call_bail] muskie-brick: bailing transport
2008-11-19 23:09:52 E [client-protocol.c:4834:client_protocol_cleanup] muskie-brick: forced unwinding frame type(1) op(14) reply=@0x67e2150
2008-11-19 23:09:52 E [client-protocol.c:3254:client_write_cbk] muskie-brick: no proper reply from server, returning ENOTCONN
2008-11-19 23:09:56 E [write-behind.c:602:wb_writev] wb: delayed error : 107

or (caused by the namespace):

2008-11-28 20:47:53 C [client-protocol.c:212:call_bail] muskie-ns: bailing transport
2008-11-28 20:47:53 E [client-protocol.c:4834:client_protocol_cleanup] muskie-ns: forced unwinding frame type(1) op(40) reply=@0x1b447cc0
2008-11-28 20:47:53 E [client-protocol.c:4613:client_checksum_cbk] muskie-ns: no proper reply from server, returning ENOTCONN
2008-11-28 20:47:53 E [client-protocol.c:325:client_protocol_xfer] muskie-ns: transport_submit failed

2. Right now the 'glusterfs' process takes 1785M of virtual memory and 1500M resident, according to top. I hope this is not a memory leak; at the very least, there should be a way to reduce memory usage without remounting. (A rough monitoring sketch is in the P.S. below.)

If somebody can shed some light on these issues, I would appreciate it. Just let me know if you need more detailed information.

Best,
Manhong
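
P.S. For item 2, I plan to check whether the client's resident memory keeps growing or eventually levels off by sampling it periodically, with something like the rough, untested Python sketch below. It assumes a Linux /proc filesystem and that "pidof glusterfs" returns the client's PID on the node; both are assumptions on my part, not anything specific to GlusterFS.

#!/usr/bin/env python
# Rough sketch: log the glusterfs client's resident memory every 5 minutes,
# so a steadily growing VmRSS (a likely leak) can be told apart from a plateau.
import subprocess
import time

def client_rss_kb():
    # Assumption: "pidof glusterfs" returns the client process ID(s);
    # take the first one if several are running.
    pid = subprocess.check_output(["pidof", "glusterfs"]).split()[0].decode()
    with open("/proc/%s/status" % pid) as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # /proc reports VmRSS in kB

while True:
    print(time.strftime("%Y-%m-%d %H:%M:%S"), client_rss_kb(), "kB")
    time.sleep(300)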