> That all said - the GlusterFS representative responded they have chosen to
> error on the side of "conservative", where they choose to keep the file if
> they cannot find proof that it should be removed, which unintentionally
> supports your model. This being the case, it does lead to supporting your
> position, as you are also looking for "conservative" behaviour in the case
> of an error path during self-heal and backend consistency checks.

GlusterFS healing files with missing extended attributes is one thing. The purpose of this "feature" is that, in those corner cases where an fsck strips the xattrs off a few files, those files get reconstructed (the "conservative" approach). If this were all that was necessary for a soft migration, why would we not portray it as a feature? Because there are a few other considerations, which make this conservative healing different from soft migration.

When soft migration is claimed to be supported, the user expects the filesystem to be usable from the moment she mounts it, with pre-existing data present in one backend. The filesystem is expected to be "operational" from that very moment. An initial 'ls -lR' would start recreating the second copy. If the ls -lR goes to completion, the user is all set. That is the best case scenario.

Now what if the source node goes down in the middle of the healing? Nothing prevents the user from continuing to use the mountpoint with only the half-healed second server and making changes that could conflict with the not-yet-healed half remaining on the first server. A "supported" soft migration is expected to handle all such situations. The current self heal does not, and might result in split brains in those situations. Which is why we say that using the "conservative" healing as a soft migration "feature" is misusing (rather, stretching beyond limits) the self heal feature.

In fact, what happened in the original post of this thread is very similar to what I just described. The "soft migration" was interrupted in the middle, and changes were made to the second server (in fact directly in the backend!) which ended up being incompatible with the first server when everything was brought up. The xattrs were obviously inconsistent, and the result was split brains. The right way would have been to just copy everything through the mountpoint. Using self heal this (undocumented) way is really stretching it beyond limits and expecting what was never a claimed feature.

Avati