On Sat, 10 Jun 2017 03:01:33 -0400 Jeff King <peff@xxxxxxxx> wrote: > On Fri, Jun 09, 2017 at 12:23:24PM -0700, Jonathan Tan wrote: > > > Looking at the 3 primary functions (sha1_object_info_extended, > > read_object, has_sha1_file_with_flags), they independently implement > > mechanisms such as object replacement, retrying the packed store after > > failing to find the object in the packed store then the loose store, and > > being able to mark a packed object as bad and then retrying the whole > > process. Consolidating these mechanisms would be a great help to > > maintainability. > > > > Such a consolidated function would need to handle the read_object() case > > (which returns the object data, type, and size) and the > > sha1_object_info_extended() case (which returns the object type, size, > > and some additional information, not all of which can be "turned off" by > > passing NULL in "struct object_info"). > > I like the idea of consolidating the logic. But I can't help but feel > that pulling these fields out of object_info is a step backwards. The > point of that struct is to let the caller specify which aspects of the > object they're interested in My issue was that there are some parts that cannot be turned off (in particular, the object_info.u.packed part). Having said that, reading the packed object itself should give us enough information to populate that, so I'll take a look and see if this is possible. > Another approach to this whole mess is to have a single function for > acquiring a "handle" to an object, along with functions to query aspects > of a handle. That would let callers iteratively ask for the parts they > care about, and we could lazily fill the handle info (so information we > pick up while servicing one property of the object gets cached and > returned for free if the caller asks for it later). > > That's a much bigger change, though it may have other benefits (e.g., we > could be passing around handles instead of object buffers, which would > make it more natural to stream object content in many cases). There are a few safeguards that, I think, only work with the current get-everything-then-forget-about-it approach (the packed-loose-packed retry mechanism, and the desperate retry-if-corrupt-packed-object one). If we have a handle with a cache, then, for example, we would lose the ability to retry packed after loose if the handle has already declared that the object is loose.