On Tue, Mar 7, 2017 at 10:57 AM, Ian Jackson <ijackson@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Also I think you need to specify how abbreviated object names are
> interpreted.

One option might be to not use hex for the new hash, but base64 encoding.

That would make the full-size ASCII hash encoding length roughly similar (43 base64 characters rather than 40), which would offset some of the new costs (longer filenames in the loose format, for example).

Also, since 256 isn't evenly divisible by 6, and because you'd want some way to explicitly disambiguate the new hashes, the rule *could* be that the ASCII representation of a new hash is the base64 encoding of the 258-bit value that has "10" prepended to it as padding.

That way the first character of the hash would be guaranteed to not be a hex digit, because it would be in the range [g-v] (indexes 32..47).

Of course, the downside is that base64 encoded hashes can also end up looking very much like real words, and now case would matter too.

The "use base64 with a "10" two-bit padding prepended" rule also means that the natural loose format radix would remain the first 2 characters of the hash, but because the first character contains the padding, it would be a fan-out of 2**10 rather than 2**12.

Of course, having written that, I now realize how it would cause problems for the usual shit-for-brains case-insensitive filesystems. So I guess base64 encoding doesn't work well for that reason.

Linus
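
For reference, the "10"-prefixed base64 rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not anything Git implements: the function name `encode_new_hash` is hypothetical, the standard RFC 4648 alphabet is assumed, and SHA-256 stands in for the unspecified 256-bit "new hash".

```python
import hashlib

# Standard base64 alphabet (RFC 4648): index 32 is 'g', index 47 is 'v'.
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_new_hash(digest: bytes) -> str:
    """Hypothetical helper: 256-bit digest -> 43 base64 characters."""
    if len(digest) != 32:
        raise ValueError("expected a 256-bit (32-byte) digest")
    # Prepend the two bits "10", giving 258 bits = exactly 43 six-bit groups.
    value = (0b10 << 256) | int.from_bytes(digest, "big")
    # Emit the groups from most significant to least significant.
    return "".join(B64[(value >> shift) & 0x3F] for shift in range(252, -1, -6))


# SHA-256 is an assumption here; the mail only says "the new hash".
name = encode_new_hash(hashlib.sha256(b"hello").digest())

# The first character carries only 4 hash bits (16 values) and the second
# carries 6 (64 values), so a two-character directory prefix fans out to
# 16 * 64 = 2**10 directories.
fanout = 16 * 64
```

The first character always falls in [g-v], so it can never be mistaken for a hex digit. The second character can be any base64 symbol, so prefixes such as "gA" and "ga" would be distinct directories that differ only by case, which is the case-insensitive filesystem problem noted above.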