On Fri, Jun 09, 2017 at 12:23:24PM -0700, Jonathan Tan wrote: > Looking at the 3 primary functions (sha1_object_info_extended, > read_object, has_sha1_file_with_flags), they independently implement > mechanisms such as object replacement, retrying the packed store after > failing to find the object in the packed store then the loose store, and > being able to mark a packed object as bad and then retrying the whole > process. Consolidating these mechanisms would be a great help to > maintainability. > > Such a consolidated function would need to handle the read_object() case > (which returns the object data, type, and size) and the > sha1_object_info_extended() case (which returns the object type, size, > and some additional information, not all of which can be "turned off" by > passing NULL in "struct object_info"). I like the idea of consolidating the logic. But I can't help but feel that pulling these fields out of object_info is a step backwards. The point of that struct is to let the caller specify which aspects of the object they're interested in, and let the lookup function decide how best to optimize the query. So it seems like places which actually want to read the object should be passing in a new field in the object_info for "yes, I actually want the object contents, too", and then the consolidated function can decide which approach to take based on whether or not the contents are requested (e.g., unpacking the whole thing, or just the header). If a caller asks for the contents but not the size, that's OK. We'd find the size incidentally while unpacking the contents, but just not include it in the returned object_info. Another approach to this whole mess is to have a single function for acquiring a "handle" to an object, along with functions to query aspects of a handle. That would let callers iteratively ask for the parts they care about, and we could lazily fill the handle info (so information we pick up while servicing one property of the object gets cached and returned for free if the caller asks for it later). That's a much bigger change, though it may have other benefits (e.g., we could be passing around handles instead of object buffers, which would make it more natural to stream object content in many cases). -Peff