Dear experts,

we have started to test Gluster (3.3.0) on a small cluster that has grown over time (~200 cores and ~100 TB on heterogeneous hardware). Our data are typically organised in a tree of directories, each containing a set of files with the same filenames, e.g.

    dir1
    |- file1 (100 MB)
    |- file2 (1 MB)
    dir2
    |- file1 (100 MB)
    |- file2 (1 MB)
    dir3
    |- file1 (100 MB)
    |- file2 (1 MB)
    ...

To gain some experience with rebalancing, we set up a volume with a single brick, copied some data organised as described above, added a second brick and started the rebalance. The result was that all files with a given name ended up on the same brick. In our case this leads to a very inhomogeneous distribution of data volume, since the different types of files have very different sizes.

Looking at the implementation in the dht translator and checking the calculated hashes, it seems that only the basename is used to compute the hash of a given file. Since all directories carry the same mapping of hash ranges to bricks, this would explain our observation if only that file hash is used for placement. However, I also see hashes being calculated for directories, and it is not clear to me what they are used for.

Am I missing something here? Is this behaviour intended? Is there a (supported) way to still distribute the files homogeneously across all bricks, e.g. by using the full path for the hashing (which is actually what I understood from the manual), or by shuffling the hash ranges per directory?

There must be other people with many directories containing the same set of files. Any recommendations on how to handle this?

Many thanks,
Jochen
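
P.S. To make the suspected behaviour concrete, here is a toy sketch of my understanding (plain Python, not the actual dht code; crc32 just stands in for the real hash function): if every directory uses the same hash-range-to-brick layout and only the basename is hashed, then every "file1" lands on one fixed brick and every "file2" on another, no matter which directory they live in.

    # Toy illustration (not GlusterFS code): identical per-directory layout
    # plus a hash over the basename only means a given filename always maps
    # to the same brick, regardless of its directory.
    import zlib  # crc32 used here as a stand-in for the real DHT hash

    BRICKS = ["brick1", "brick2"]

    def pick_brick(path):
        basename = path.rsplit("/", 1)[-1]      # only the basename is hashed
        h = zlib.crc32(basename.encode()) & 0xffffffff
        # same layout in every directory: split the 32-bit range evenly
        return BRICKS[h * len(BRICKS) // (1 << 32)]

    for d in ("dir1", "dir2", "dir3"):
        for f in ("file1", "file2"):
            print(d + "/" + f, "->", pick_brick(d + "/" + f))
    # output: every file1 goes to one brick, every file2 to another,
    # in all directories -- which matches what we see after rebalancing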