Junio C Hamano <gitster@xxxxxxxxx> writes:

> "Eric W. Biederman" <ebiederm@xxxxxxxxx> writes:
>
>> As far as I can tell this extra pass defeats most of the purpose of
>> streaming, and it is much easier to implement with in memory buffers.
>
> The purpose of streaming being the ability to hash and compute the
> object name without having to hold the entirety of the object, I am
> not sure the above is a good argument.  You can run multiple passes
> by streaming the same data twice if you needed to, and how much
> easier the implementation may become if you can assume that you can
> hold everything in-core, what you cannot fit in-core would not fit
> in-core, so ...

Yes, this wording needs to be clarified.

If streaming to handle objects that don't fit in memory is the purpose,
I agree there are slow multi-pass ways to deal with trees, commits, and
tags.

If writing directly to the pack is the purpose, using an in-core buffer
for trees, commits, and tags is better.

I will put the wording on the back burner and see what I come up with.

>> So if it is needed to write commits, trees, and tags directly to pack
>> files writing a separate function to do the would be needed.
>
> But I am OK with this conclusion.  As the way to compute the
> fallback hashes for different types of objects are very different,
> compared to a single-hash world where as long as you come up with a
> serialization you have only a single way to hash and name the
> object.  We would end up having separate helper functions per target
> type anyway, even if we kept a single entry point function like
> index_stream().  The single entry point function will only be used
> to just dispatch to type specific ones, so renaming what we have today
> and making it clear they are for "blobs" does make sense.

Good.  I am glad I am able to step back and successfully explain the
whys of things.

Eric