On 3/13/2012 3:53 PM, Ted Ts'o wrote:
Because that would be a format change.
I think a format change would be preferable to runtime sorting.
What we have today is not a hash table; it's a hashed tree, where we use a fixed-length key for the tree based on the hash of the file name. Currently the leaf nodes of the tree are the directory blocks themselves; that is, the lowest level of the index blocks tells you to look at directory block N, where that directory contains the directory indexes for those file names which are in a particular range (say, between 0x2325777A and 0x2325801).
So the index nodes contain the hash ranges for the leaf block, but the leaf block only contains the regular directory entries, not a hash for each name? That would mean that adding or removing names would require moving around the regular directory entries wouldn't it?
If we aren't going to change the ordering of the directory directory, that means we would need to change things so the leaf nodes contain the actual directory file names themselves, so that we know whether or not we've hit the correct entry or not before we go to read in a specific directory block (otherwise, you'd have problems dealing with hash collisions). But in that case, instead of storing the pointer to the directory entry, since the bulk of the size of a directory entry is the filename itself, you might as well store the inode number in the tree itself, and be done with it.
I would think that hash collisions are rare enough that reading a directory block you end up not needing once in a blue moon would be chalked up under "who cares". So just stick with hash, offset pairs to map the hash to the normal directory entry.
-- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html