Jules,

When a frame hits its timeout, 'rpc/rpc-lib/src/rpc-clnt.c:138:call_bail (void *data)' is triggered. When the client observes a network disconnection (ping-timer expiry etc.), it triggers 'rpc/rpc-lib/src/rpc-clnt.c:341:saved_frames_unwind (struct saved_frames *saved_frames)'.

When a node goes down, the ping timer expires and the frames are then unwound within at most ~42 seconds. So in the VM scenario it won't hang for 30 minutes.

To answer your actual question, why such a big frame timeout: Afr takes entry-locks while performing self-heals, and these block other entry fops such as create, delete etc. The timeout is set large enough for those entry operations to succeed. Afr used to take a lock on the entire file to perform data-self-heal on a regular file; we managed to remove that. We are working on doing the same for entry-self-heal. Once that happens we will be in a good position to change these to lower values.

Pranith.

----- Original Message -----
From: "Jules Wang" <lancelotds@xxxxxxx>
To: "devel" <gluster-devel@xxxxxxxxxx>
Sent: Wednesday, August 1, 2012 1:55:47 PM
Subject: question on time-out parameters

Hi all,

When I was tracking the bug https://bugzilla.redhat.com/show_bug.cgi?id=794699 I noticed that the default value of "ping-timeout" was 42 and the default value of "frame-timeout" was 1800 (30 minutes) (in xlators/protocol/client/src/client.c).

When a node is down (e.g. powered off), the volume will be out of service for a long time. If there is a VM running on the volume, it will probably crash.

So I wonder why we set such large values for these parameters?

Best Regards.
Jules Wang
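
As a rough illustration of how the two timeouts in this thread interact, here is a small Python model. It is only a sketch, not GlusterFS code: frame_unwind and its arguments are hypothetical names, and it assumes the frame was sent before the node went down and that the disconnect is noticed exactly ping-timeout seconds after that. The defaults 42 and 1800 are the values quoted in the question.

```python
PING_TIMEOUT = 42      # seconds, default "ping-timeout" quoted in the thread
FRAME_TIMEOUT = 1800   # seconds, default "frame-timeout" quoted in the thread


def frame_unwind(sent_at, node_down_at=None,
                 ping_timeout=PING_TIMEOUT, frame_timeout=FRAME_TIMEOUT):
    """Return (time, path) at which a pending frame is given up on.

    Hypothetical helper: a simplified model of the behaviour described
    in the reply, not the actual rpc-clnt.c logic.
    """
    # Path 1: the frame's own timeout fires (call_bail in the thread).
    bail_at = sent_at + frame_timeout
    if node_down_at is None:
        return bail_at, "call_bail"

    # Path 2: the ping timer expires and the disconnect unwinds all
    # saved frames (saved_frames_unwind in the thread).
    # Assumption: detection takes exactly ping_timeout seconds.
    disconnect_at = node_down_at + ping_timeout
    if disconnect_at < bail_at:
        return disconnect_at, "saved_frames_unwind"
    return bail_at, "call_bail"


if __name__ == "__main__":
    # Node powered off 10 s after the frame was sent.
    print(frame_unwind(sent_at=0, node_down_at=10))
    # Node stays up but the reply is slow, e.g. blocked behind an entry-lock.
    print(frame_unwind(sent_at=0))
```

In this model a frame sent to a node that is then powered off is released by the ping-timer path, and the 1800-second limit only applies while the connection stays up, which is the case the self-heal entry-locks are about.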