Hi

A question on self heal: As I understand it, when a lookup occurs, the client checks whether self heal must be done, heals if required, then proceeds with the lookup.

I encounter a rare situation where self heal is done but I still get the non-healed result. For instance, I read a file and get no result, as if it were empty, then attempt to read it again and get the correct file content.

Here is an example. I am building in a release-3.3 glusterfs volume, and the build fails because of an empty Makefile. The client log shows that this is a replication problem:

includes ===> external/intel-fw-eula/ipw2100
nbmake: don't know how to make includes. Stop

client log:
[2012-07-30 10:09:54.756766] E [afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler] 0-pfs-replicate-0: path /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100 on subvolume pfs-client-1 => -1 (No such file or directory)
[2012-07-30 10:09:55.056577] I [afr-common.c:1340:afr_launch_self_heal] 0-pfs-replicate-0: entry self-heal triggered. path: /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100, reason: checksums of directory differ
[2012-07-30 10:09:55.062865] E [afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler] 0-pfs-replicate-0: path /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/CVS on subvolume pfs-client-1 => -1 (No such file or directory)
[2012-07-30 10:09:55.063069] E [afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler] 0-pfs-replicate-0: path /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/Makefile on subvolume pfs-client-1 => -1 (No such file or directory)
[2012-07-30 10:09:55.063268] E [afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler] 0-pfs-replicate-0: path /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/dist on subvolume pfs-client-1 => -1 (No such file or directory)
[2012-07-30 10:09:55.480500] I [afr-self-heal-common.c:2159:afr_self_heal_completion_cbk] 0-pfs-replicate-0: background entry self-heal completed on /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100

And if I run ls -l, the file is finally healed:

$ ls -l external/intel-fw-eula/ipw2100/Makefile
-rw-r--r-- 1 manu manu 224 Oct 30 2008 external/intel-fw-eula/ipw2100/Makefile

client log:
[2012-07-30 14:30:05.058560] I [afr-common.c:1340:afr_launch_self_heal] 0-pfs-replicate-0: background meta-data self-heal triggered. path: /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100, reason: lookup detected pending operations
[2012-07-30 14:30:05.086289] I [afr-self-heal-common.c:2159:afr_self_heal_completion_cbk] 0-pfs-replicate-0: background meta-data self-heal completed on /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100
[2012-07-30 14:30:05.527602] I [afr-common.c:1189:afr_detect_self_heal_by_iatt] 0-pfs-replicate-0: size differs for /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/Makefile
[2012-07-30 14:30:05.527655] I [afr-common.c:1340:afr_launch_self_heal] 0-pfs-replicate-0: background meta-data data self-heal triggered. path: /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/Makefile, reason: lookup detected pending operations
[2012-07-30 14:30:05.580709] I [afr-self-heal-algorithm.c:116:sh_loop_driver_done] 0-pfs-replicate-0: full self-heal completed on /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/Makefile
[2012-07-30 14:30:05.615283] I [afr-self-heal-common.c:2159:afr_self_heal_completion_cbk] 0-pfs-replicate-0: background meta-data data self-heal completed on /manu/netbsd/usr/src/external/intel-fw-eula/ipw2100/Makefile

This is a bug, right?

-- 
Emmanuel Dreyfus
manu@xxxxxxxxxx