On Sun, Apr 05, 2015 at 03:41:39PM +0200, René Scharfe wrote: > > The main culprits seem to be d0f810f (which introduced some extra > > expensive code for each ref) and my 10c497a, which switched from fgets() > > to strbuf_getwholeline. It turns out that strbuf_getwholeline is really > > slow. > > 10c497a changed read_packed_refs(), which reads *all* packed refs. > Each is checked for validity. That sounds expensive if the goal is > just to look up a single (non-existing) ref. > > Would it help to defer any checks until a ref is actually accessed? > Can a binary search be used instead of reading the whole file? Yes, but addressing that is much more invasive. Right now we parse all of the packed-refs file into an in-memory cache, and then do single lookups from that cache. Doing an mmap() and a binary search is way faster (and costs less memory) for doing individual lookups. It relies on the list being sorted. This is generally true, but not something we currently rely on (however, it would be easy to add a "sorted" flag to top of the file and have the readers fall back when the flag is missing). I've played with a patch to do this (it's not entirely trivial, because you jump into the middle of a line, and then have to walk backwards to find the start of the record). For traversals, it's more complicated. Obviously if you are traversing all refs, you have to read the whole thing anyway. If you are traversing a subset of the refs, you can binary-search the start of the subset, and then walk forward. But that's where it gets tricky with the current code. The ref_cache code expects to fill in from outer to inner. So if you have "refs/foo", you should also have filled in all of "refs/" (but not necessarily "refs/bar"). This matches the way we traverse loose ref directories; we opendir "refs/", find out that it has "foo" and "bar", and the descend into "foo", and so forth. But reading a subset of the packed-ref file is "inside out". You fill in all of "refs/foo", but you have no idea what else is in "refs/". So going in that direction would involve some surgery to the ref_cache code. It might even involve throwing it out entirely (i.e., just mmap the packed-refs file and look through it directly, without any kind of in-memory cache; we don't tend to do more than one ref-iteration per program anyway, so I'm not sure the caching is buying us much anyway). My big concern there would be that there are a lot of subtle race issues between packed and loose refs, and the current state is the result of a lot of tweaking. I'd be worried that a heavy rewrite there would risk introducing subtle and rare corruptions. Plus it would be a lot of work, which leads me to... > I wonder if pluggable reference backends could help here. Storing refs > in a database table indexed by refname should simplify things. ...this. I think that effort might be better spent on a ref storage format that's more efficient, simpler (with respect to subtle races and such), and could provide other features (e.g., transactional atomicity). The big plus side of packed-refs improvements is that they "just work" without worrying about compatibility issues. But ref storage is local, so I'm not sure how big a deal that is in practice. > Short-term, can we avoid the getc()/strbuf_grow() dance e.g. by mapping > the packed refs file? What numbers do you get with the following patch? It's about 9% faster than my series + the fgets optimization I posted (or about 25% than using getc). Which is certainly nice, but I was really hoping to just make strbuf_getline faster for all callers, rather than introducing special code for one call-site. Certainly we could generalize the technique (i.e., a struct with the mmap data), but then I feel we are somewhat reinventing stdio. Which is maybe a good thing, because stdio has a lot of rough edges (as seen here), but it does feel a bit like NIH syndrome. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html