On 02/25/2016 08:20 PM, Ravishankar N wrote:
> On 02/25/2016 11:36 PM, Kyle Maas wrote:
>> How can I tell what AFR version a cluster is using for self-heal?
> If all your servers and clients are 3.7.8, then they are by default
> running afr-v2. Afr-v2 was a re-write of afr that went in for 3.6,
> so any gluster package from then on has this code; you don't need to
> explicitly enable anything.

That was what I thought until I ran across this IRC log where JoeJulian
asked if it was explicitly enabled:

https://irclog.perlgeek.de/gluster/2015-10-29

>>
>> The reason I ask is that I have a two-node replicated 3.7.8 cluster (no
>> arbiters) which has locking behavior during self-heal which looks very
>> similar to that of AFRv1 (only heals one file at a time per self-heal
>> daemon, appears to lock the full inode while it's healing it instead of
>> just ranges, etc.),
> Both v1 and v2 use range locks while healing a given file, so clients
> shouldn't block when heals happen. What is the problem you're facing?
> Are your clients also at 3.7.8?

Primary symptoms are:

1. While a self-heal is running, only one file at a time is healed per
   brick. As I understand it, AFRv2 and up should allow multiple files
   to be healed concurrently, or at least multiple ranges within a file,
   particularly with io-thread-count set to >1.

2. During a self-heal, neither I/O nor network is saturated, which leads
   me to believe that I'm looking at a single synchronous self-healing
   process.

3. More troubling is that during a self-heal, clients cannot so much as
   list the files on the volume until the self-heal is done. No errors.
   No timeouts. They just freeze. As soon as the self-heal is complete,
   they unfreeze and list the contents.

4. Any file access during a self-heal also freezes, just like a
   directory listing, until the self-heal is done. This wreaks havoc on
   users who have files open when one of the bricks is rebooted and has
   to be healed, since with as much data as is stored on this cluster, a
   self-heal can take almost 24 hours.

I experience the same problems when I run without any clients other
than the bricks themselves mounting the volume, so yes, it happens with
the clients on 3.7.8 as well.

Warm Regards,
Kyle Maas
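
For the version question above, one straightforward check is to confirm
that every server and every client really is running 3.7.8, since the AFR
version follows the installed packages rather than any volume option. A
minimal sketch, assuming a standard GlusterFS install and a placeholder
volume name "myvol" (the "volume get" subcommand may not exist on older
3.x releases):

    # Run on every server and every client:
    glusterfs --version
    gluster --version

    # Confirm the volume's effective options, if "volume get" is available:
    gluster volume info myvol
    gluster volume get myvol all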
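
For symptom 1 above, the settings that are supposed to allow more than one
heal at a time can at least be inspected while the problem is happening. A
rough sketch, again assuming a volume named "myvol" and that these option
names are unchanged in 3.7.8:

    # How many heals a client mount will run in the background in parallel:
    gluster volume get myvol cluster.background-self-heal-count

    # The io-threads setting mentioned above:
    gluster volume get myvol performance.io-thread-count

    # Watch what is actually being healed during the slow period:
    gluster volume heal myvol info
    gluster volume heal myvol statistics heal-count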
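
For symptoms 3 and 4 above, one way to tell whether a heal is holding a
full-file lock rather than a range lock is to take a statedump from the
bricks while clients are frozen and look at the granted inode locks. This
is only a sketch; the dump directory and the exact field names can differ
between versions:

    # Ask the brick processes to dump their state (placeholder volume name):
    gluster volume statedump myvol

    # Dumps usually land under /var/run/gluster; inspect the lock entries
    # on the file being healed (an entry with start=0, len=0 generally
    # indicates a whole-file lock):
    grep -A 5 inodelk /var/run/gluster/*.dump.*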