On Sat, Feb 25, 2017 at 06:50:50PM +0000, brian m. carlson wrote:

> > As long as the reader can tell from the format of object names
> > stored in the "new object format" object from what era is being
> > referred to in some way [*1*], we can name new objects with only new
> > hash, I would think. "new refers only to new" that stratifies
> > objects into older and newer may make things simpler, but I am not
> > convinced yet that it would give our users a smooth enough
> > transition path (but I am open to be educated and persuaded the
> > other way).
>
> I would simply use multihash[0] for this purpose. New-style objects
> serialize data in multihash format, so it's immediately obvious what
> hash we're referring to. That makes future transitions less
> problematic.
>
> [0] https://github.com/multiformats/multihash

I looked at that earlier, because I think it's a reasonable idea for
future-proofing. The first byte is a "varint", but I couldn't find where
they defined that format.

The closest I could find is:

  https://github.com/multiformats/unsigned-varint

whose README says:

  This unsigned varint (VARiable INTeger) format is for the use in all
  the multiformats.

    - We have not yet decided on a format yet. When we do, this readme
      will be updated.

    - We have time. All multiformats are far from requiring this varint.

which is not exactly confidence inspiring. They also put the length at
the front of the hash. That's probably convenient if you're parsing an
unknown set of hashes, but I'm not sure it's helpful inside Git objects.
And there's an incentive to minimize header data at the front of a hash,
because every byte is one more byte that every single hash will collide
over, and that people will have to type when passing hashes to
"git show", etc.

I'd almost rather use something _really_ verbose like

  sha256:1234abcd...

in all of the objects. And then when we get an unadorned hash from the
user, we guess it's sha256 (or whatever), and fall back to treating it
as a sha1.

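To make the lookup order concrete, here is a rough sketch in Python
(for brevity; it is not Git code). The names parse_object_name,
HEX_LEN, DEFAULT_ORDER and the lookup callback are all made up for
illustration, and it ignores the fact that ":" already has a meaning in
revision syntax like "<rev>:<path>":

```python
import re

# Hypothetical tables for illustration; none of this is Git's actual code.
HEX_LEN = {"sha1": 40, "sha256": 64}   # full hex length per algorithm
DEFAULT_ORDER = ("sha256", "sha1")     # guess the new hash first, then fall back


def parse_object_name(name, lookup):
    """Return (algo, hex) for an explicit "algo:hex" or an unadorned hex name.

    `lookup(algo, hex_prefix)` is an assumed callback that returns True
    when an object with that (possibly abbreviated) name exists.
    """
    algo, sep, digest = name.partition(":")
    if sep:
        # Explicit "algo:hex" form: no guessing required.
        if algo not in HEX_LEN:
            raise ValueError(f"unknown hash algorithm: {algo}")
        candidates = (algo,)
    else:
        # Unadorned hash: try each algorithm in the default order.
        digest = name
        candidates = DEFAULT_ORDER

    digest = digest.lower()
    if not re.fullmatch(r"[0-9a-f]+", digest):
        raise ValueError(f"not a hex object name: {name}")

    for algo in candidates:
        if len(digest) <= HEX_LEN[algo] and lookup(algo, digest):
            return algo, digest
    raise KeyError(f"no such object: {name}")


# Toy object store with one abbreviated name per algorithm.
objects = {("sha1", "deadbeef"), ("sha256", "1234abcd")}
lookup = lambda algo, prefix: (algo, prefix) in objects

print(parse_object_name("sha256:1234abcd", lookup))  # explicit form
print(parse_object_name("deadbeef", lookup))         # sha256 miss, then sha1
```

The explicit form never has to guess; only the unadorned form pays for
the extra lookup, and only when the sha256 guess misses.
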
Using a syntactically-obvious name like that also solves one other
problem: there are sha1 hashes whose first bytes will encode as a "this
is sha256" multihash, creating some ambiguity.

-Peff
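
As a footnote on the ambiguity point, a small Python sketch can show
the overlap. It assumes the function code and the length each encode as
a single byte (true for small values under a LEB128-style varint, but
that is an assumption given the README quoted above), and it uses the
codes from the multihash table, 0x11 for sha1 and 0x12 for sha2-256.
The helper names are made up:

```python
import hashlib

# Function codes from the multihash table; each is assumed here to
# encode as a single byte.
SHA1_CODE, SHA256_CODE = 0x11, 0x12


def multihash(code, digest):
    """Encode <code><length><digest>, assuming one-byte code and length."""
    assert code < 0x80 and len(digest) < 0x80
    return bytes([code, len(digest)]) + digest


def find_sha1_with_prefix(prefix):
    """Brute-force a small input whose raw SHA-1 starts with `prefix`."""
    n = 0
    while True:
        digest = hashlib.sha1(str(n).encode()).digest()
        if digest.startswith(prefix):
            return n, digest
        n += 1


mh = multihash(SHA256_CODE, hashlib.sha256(b"hello").digest())
header = mh[:2]                      # 0x12 (sha2-256), 0x20 (32 bytes)
n, sha1 = find_sha1_with_prefix(header)

print("multihash header:", header.hex())
print("plain sha1 of", repr(str(n)), "is", sha1.hex())
```

With a two-byte header, roughly one sha1 in 65536 starts with the same
four hex digits, so an abbreviated name beginning with them could be
read either way.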