[Re-titled, as we are off-topic from the original patch series]

On Sun, Mar 17, 2013 at 05:38:59PM +0100, René Scharfe wrote:

> On 17.03.2013 at 06:40, Jeff King wrote:
> >We do have the capability to roll out to one or a few of our servers
> >(the granularity is not 0.2%, but it is still small). I'm going to try
> >to keep us more in sync with upstream git, but I don't know if I will
> >get to the point of ever deploying "master" or "next", even for a small
> >portion of the population. We are accumulating more hacks[1] on top of
> >git, so it is not just "run master for an hour on this server"; I have
> >to actually merge our fork.
>
> Did you perhaps intend to list these hacks in a footnote or link to a
> repository containing them? (I can't find the counterpart of that
> [1].)

I was actually just going to say "some of which are gross hacks that
will never see the light of day, some of which have already gone
upstream, and some of which I am planning on submitting upstream".

But since I happened to be cataloguing them recently, here is the list
of things that have not yet gone upstream. If anybody is interested in a
particular topic, I'm happy to discuss and/or prioritize moving it
forward.

  - blame-tree; re-rolled from my submission last year to build on top
    of the revision machinery, handle merges sanely, etc. Mostly this
    needs documentation and a clean-up of the output format (which is
    very utilitarian, but probably should share output with git-blame).

  - diff --max-depth; this is a requirement to do blame-tree
    efficiently if you want to do GitHub-style listings (you must
    recurse to find the history of some/subdir, but you do not want to
    recurse past that for efficiency reasons). This is hung up on two
    things:

      1. It does not integrate with the pathspec max-depth code,
         because we do not use struct pathspec in the tree diff (but I
         think Duy's patches are changing that).

      2. My definition of --max-depth is subtly different from that of
         "git grep".
         But I think mine is more useful, and I haven't decided how to
         reconcile it.

  - share ref selection code between "git branch", "git tag", and "git
    for-each-ref". This includes cleaning up the "tag --contains" code
    to be safer for general use (so that "branch --contains" can
    benefit from the speedup), and then getting the same options for
    all three commands (tag doesn't know about --merged, and
    for-each-ref doesn't know about --contains or --merged).

  - receive.maxsize; index-pack will happily spool data to disk
    forever, and you never even get a chance to make a policy decision
    like "hey, this is too big". This patch lets index-pack cut off the
    client after a certain number of bytes. It's not elegant because
    the cutoff transfer is not resumable, but we use it as a last-ditch
    for DoS protection (the client can reconnect and send more, of
    course, but at that point we have the opportunity to make external
    policy decisions like locking their account). Not sure if other
    sites would want this or not.

  - receive.advertisealternates; basically turn off ".have"
    advertisement. Some of our alternates networks are so huge that the
    cost of collecting all of the alternate refs is very high (even
    though it can save some transfer bandwidth). Not sure if other
    sites want this or not (and I think it would be more elegant to
    have a small static set of common refs that people build off of,
    and advertise those. e.g., if you fork rails/rails, then we should
    advertise rails/rails/refs/heads/master as a ".have", but not
    anybody else's fork).

  - receive.hiderefs; this is going to become redundant with Junio's
    implementation.

  - an audit reflog; we keep a reflog for all refs at the root of the
    repository. It differs from a regular reflog in that:

      1. It never expires.

      2. It is not part of reachability analysis.

      3. It includes the refname for each entry, so you can see
         deletions.
    It's mostly useful for forensics when somebody has screwed up their
    repository (or we're chasing down a git bug; it helped me find the
    pack-refs race recently). Probably too GitHub-specific for other
    people to want it (especially because it grows without bound).

  - statistics instrumentation; we keep counters for various things in
    code (e.g., which phase of protocol upload-pack is in, how many
    bytes sent, etc) and expose them in a few ways. One is over a
    socket to run a "top"-like interface. Another is to tweak the argv
    array of the process so that "ps" shows the process state. I think
    it would be useful to other people running git servers, but the
    code is currently quite nasty and invasive. I have a
    work-in-progress to clean it up, but it's got a ways to go.

  - hacks to set niceness and io-priority; this should be done by a
    wrapper, but in our case it was simpler to catch all processes by
    just building it into git. Too gross to go upstream.

  - ignore some fsck warnings under transfer.fsckobjects; some of them
    are annoyingly common when people pull old history from an existing
    project and try to push it back up. It's not indicative of a new
    bug in an implementation, but we have to live with the broken
    history forever (e.g., zero-padded modes in trees).

-Peff