On 07/12/2013 11:22 AM, Jeff King wrote:
> Yet another option is to consider what the check is doing, and
> accomplish the same thing in a different way. The real pain is that we
> are individually trying to resolve each object by hitting the filesystem
> (and doing lots of silly checks on the refname format, when we know it
> must be valid).
>
> We don't actually care in this case if the ref list is up to date (we
> are not trying to update or read a ref, but only know if it exists, and
> raciness is OK). IOW, could we replace the dwim_ref call for the warning
> with something that directly queries the ref cache?

I think it would be quite practical to add an API something like

    struct ref_snapshot *get_ref_snapshot(const char *prefix)
    void release_ref_snapshot(struct ref_snapshot *)
    int lookup_ref(struct ref_snapshot *, const char *refname,
                   unsigned char *sha1, int *flags)

where prefix is the part of the refs tree that you want included in the
snapshot (e.g., "refs/heads") and ref_snapshot is probably opaque
outside of the refs module.

Symbolic refs, which are currently not stored in the ref_cache, would
have to be added because otherwise we would have to do all of the
lookups anyway.

I think this would be a good step to take for many reasons, including
because it would be another useful step in the direction of ref
transactions.

But with particular respect to "git cat-file", I see problems:

1. get_ref_snapshot() would have to read all loose and packed refs
   within the specified subtree, because loose refs have to be read
   before packed refs. So the call would be expensive if there are a
   lot of loose refs. And DWIM wouldn't know in advance where the
   references might be, so it would have to set prefix="". If many refs
   are looked up, then it would presumably be worth it. But if only a
   couple of lookups are done and there are a lot of loose refs, then
   using a cache would probably slow things down.
   The slowdown could be ameliorated by adding some more intelligence,
   for example only populating the loose refs cache after a certain
   number of lookups have already been done.

2. A "git cat-file --batch" process can be long-lived. What guarantees
   would users expect regarding its lookup results? Currently, its ref
   lookups reflect the state of the repo at the moment the commit
   identifier is written into the pipe. Using a cache like this would
   mean that ref lookups would always reflect the snapshot taken at the
   start of the "git cat-file" run, regardless of whether the script
   using it might have added or modified some references since then. I
   think this would have to be considered a regression.

Michael

--
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/
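
P.S. To make the proposed API a bit more concrete, here is a rough,
self-contained sketch of how the three functions might fit together.
Everything beyond the three signatures is an assumption made for
illustration: the ref_entry struct, the hard-coded ref_store table
(which stands in for the loose and packed refs on disk), the dummy
object names, and the 0 / -1 return convention of lookup_ref(). It is
not code from the refs module and it ignores symbolic refs entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type; the real ref_cache is organized differently. */
struct ref_entry {
	const char *refname;
	unsigned char sha1[20];
	int flags;
};

/* Stand-in for the refs on disk (made-up names and dummy object names). */
static const struct ref_entry ref_store[] = {
	{ "refs/tags/v1.0",    { 0x33 }, 0 },
	{ "refs/heads/master", { 0x11 }, 0 },
	{ "refs/heads/next",   { 0x22 }, 0 },
};

/* Opaque to callers in the proposal; defined here so the sketch compiles. */
struct ref_snapshot {
	struct ref_entry *entries;	/* sorted by refname */
	size_t nr;
};

static int cmp_entry(const void *a, const void *b)
{
	const struct ref_entry *x = a, *y = b;
	return strcmp(x->refname, y->refname);
}

/* Copy every ref whose name starts with prefix (plain string-prefix match). */
struct ref_snapshot *get_ref_snapshot(const char *prefix)
{
	size_t i, n = sizeof(ref_store) / sizeof(ref_store[0]);
	size_t plen = strlen(prefix);
	struct ref_snapshot *snap = malloc(sizeof(*snap));

	if (!snap)
		return NULL;
	snap->entries = malloc(n * sizeof(*snap->entries));
	if (!snap->entries) {
		free(snap);
		return NULL;
	}
	snap->nr = 0;
	for (i = 0; i < n; i++)
		if (!strncmp(ref_store[i].refname, prefix, plen))
			snap->entries[snap->nr++] = ref_store[i];
	qsort(snap->entries, snap->nr, sizeof(*snap->entries), cmp_entry);
	return snap;
}

void release_ref_snapshot(struct ref_snapshot *snap)
{
	if (!snap)
		return;
	free(snap->entries);
	free(snap);
}

/* Assumed convention: return 0 if refname is in the snapshot, -1 if not. */
int lookup_ref(struct ref_snapshot *snap, const char *refname,
	       unsigned char *sha1, int *flags)
{
	struct ref_entry key, *found;

	assert(snap && refname);
	key.refname = refname;
	found = bsearch(&key, snap->entries, snap->nr,
			sizeof(*snap->entries), cmp_entry);
	if (!found)
		return -1;
	if (sha1)
		memcpy(sha1, found->sha1, 20);
	if (flags)
		*flags = found->flags;
	return 0;
}
```

A caller would take one snapshot with get_ref_snapshot("refs/heads"),
do any number of lookup_ref() calls against it, and then call
release_ref_snapshot(). Because the entries are copied when the
snapshot is taken, later changes to the underlying refs are invisible
to lookups, which is exactly the behavior described in point 2 above.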