> Can you tell me please why every volume rebalance generates a new value
> for the volume commit hash?
>
> If I have fully rebalanced cluster (or almost) with millions of
> directories then rebalance has to change DHT xattr for every directory
> only because there is a new volume commit hash value. It is pointless in
> my opinion. Is there any reason behind this? As I observed, the volume
> commit hash is set at the rebalance beginning which totally destroys
> benefit of lookup optimization algorithm for directories not
> scanned/fixed yet by this rebalance run.

It disables the optimization because the optimization would no longer
lead to correct results. There are plenty of distributed filesystems
that seem to have "fast but wrong" as a primary design goal; we're not
one of them.

The best way to think of the volume-commit-hash update is as a kind of
cache invalidation. Lookup optimization is only valid as long as we
know that the actual distribution of files within a directory is
consistent with the current volume topology. That ceases to be the case
as soon as we add or remove a brick, leaving us with three choices.

(1) Don't do lookup optimization at all. *Every* time we fail to find a
file on the brick where hashing says it should be, look *everywhere*
else. That's how things used to work, and still work if lookup
optimization is disabled. The drawback is that every add/remove brick
operation causes a permanent and irreversible degradation of lookup
performance. Even on a freshly created volume, lookups for files that
don't exist anywhere will cause every brick to be queried.

(2) Mark every directory as "unoptimized" at the very beginning of
rebalance. Besides being almost as slow as fix-layout itself, this
would require blocking all lookups and other directory operations
*anywhere in the volume* while it completes.

(3) Change the volume commit hash, effectively marking every directory
as unoptimized without actually having to touch every one.
The root-directory operation is cheap and almost instantaneous.
Checking each directory commit hash isn't free, but it's still a lot
better than (1) above. With upcalls we can enhance this even further.

Now that you know a bit more about the tradeoffs, do "pointless" and
"destroys the benefit" still seem accurate?
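
To make the cache-invalidation analogy concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not the actual DHT code: the names `Volume`, `Directory`, `hashed_brick`, `change_topology` and `rebalance_directory` are hypothetical, CRC32 modulo the brick count stands in for the real hash-to-layout mapping, and the topology change and the commit-hash bump are merged into one step for brevity.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Volume:
    bricks: int            # number of bricks in the toy volume
    commit_hash: int = 1   # volume-wide value, stored once at the root


@dataclass
class Directory:
    commit_hash: int                           # stamped when placement was last verified
    files: dict = field(default_factory=dict)  # file name -> brick actually holding it


def hashed_brick(name, bricks):
    # Deterministic stand-in for the real hash-to-layout mapping (assumption).
    return zlib.crc32(name.encode()) % bricks


def create(vol, d, name):
    # New files are placed on the brick that hashing selects.
    d.files[name] = hashed_brick(name, vol.bricks)


def lookup(vol, d, name):
    """Return (brick holding the file or None, number of bricks queried)."""
    target = hashed_brick(name, vol.bricks)
    if d.files.get(name) == target:
        return target, 1
    if d.commit_hash == vol.commit_hash:
        # Placement is known to match the topology, so a miss is authoritative.
        return None, 1
    # Stale directory: the only correct answer comes from asking every brick.
    return d.files.get(name), vol.bricks


def change_topology(vol):
    # Toy simplification: adds a brick and bumps the commit hash in one step.
    # A single root-level update invalidates every directory at once.
    vol.bricks += 1
    vol.commit_hash += 1


def rebalance_directory(vol, d):
    # Move files to where hashing now expects them, then re-stamp the directory.
    for name in d.files:
        d.files[name] = hashed_brick(name, vol.bricks)
    d.commit_hash = vol.commit_hash
```

In this model, a lookup for a nonexistent name costs one brick query while the directory's stamp matches the volume's, costs `vol.bricks` queries after `change_topology`, and drops back to one query once `rebalance_directory` has processed that directory. Directories the rebalance has not reached yet stay slow but correct, which is the tradeoff behind option (3).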