On Wed, 11 Apr 2007, Linus Torvalds wrote:
On Thu, 12 Apr 2007, Sam Vilain wrote:
The reason a *full* global fsck is so expensive is that it would have an absolutely humungous working set, and effectively keep everything in memory through it all. Doing it in stages ("fsck smaller individual trees separately") is actually the same amount of absolute work, but the working set never grows, so it scales much better.
(fsck'ing projects individually also happens to allow you to do the sub-project fsck's in parallel across multiple CPUs or multiple machines, so it actually scales much better that way too - but the big problem tends to be excessive memory use, so the "SMP parallel version" only makes sense if you have tons of memory and can afford to do these things at the same time!)
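To make the working-set argument concrete, here is a rough sketch on a toy object graph. It is not how git fsck is implemented; the graph, the project names, and the helper functions are all made up for illustration, and it counts object ids as a stand-in for memory use.

```python
# Toy model: object id -> ids it references. All names are hypothetical.
def reachable(graph, roots):
    """Return every object id reachable from roots (iterative DFS)."""
    seen = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(graph.get(oid, ()))
    return seen

def global_working_set(graph, projects):
    """One walk over every project's roots: all objects are held at once."""
    all_roots = [r for roots in projects.values() for r in roots]
    return len(reachable(graph, all_roots))

def staged_peak_working_set(graph, projects):
    """One walk per project: the set is dropped between projects."""
    return max(len(reachable(graph, roots)) for roots in projects.values())

graph = {
    "a1": ["a2", "shared"], "a2": [],
    "b1": ["b2", "shared"], "b2": ["b3"], "b3": [],
    "shared": [],
}
projects = {"A": ["a1"], "B": ["b1"]}

print(global_working_set(graph, projects))       # size of the single global walk
print(staged_peak_working_set(graph, projects))  # largest per-project walk
```

In this toy the staged walk visits the shared object once per project, so it does slightly more work, but the peak set it holds is bounded by the largest single project rather than by the whole store.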
Would it make sense to have a --multiple-project option for fsck that would let you specify multiple 'projects' that share an object set, and have the default checking not do the reachability checks that cause problems in this case?
Then people can share the objects if they want to and still do a full check, but they would be warned that the full check will take a long time. That is not a big problem for a housekeeping task that is run infrequently to find unreachable objects (something that should seldom happen in a well-managed project).
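A rough sketch of what that could look like, again on a toy model. This only models the suggestion; it is not existing git fsck behaviour, and `check_projects`, `walk`, the `full` switch, and the sample store are hypothetical names invented for the example.

```python
def walk(store, roots):
    """Return (reachable ids, missing ids) starting from roots."""
    seen, missing = set(), set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen or oid in missing:
            continue
        if oid not in store:
            missing.add(oid)  # referenced but absent from the shared store
            continue
        seen.add(oid)
        stack.extend(store[oid])
    return seen, missing

def check_projects(store, projects, full=False):
    """Check several projects that share one object store.

    Default: per-project check only (are referenced objects present?).
    full=True: also report objects that no project reaches.
    """
    report = {"missing": {}, "unreachable": None}
    union = set()
    for name, roots in projects.items():
        seen, missing = walk(store, roots)
        report["missing"][name] = sorted(missing)
        if full:
            union |= seen  # only ids accumulate across projects
    if full:
        report["unreachable"] = sorted(set(store) - union)
    return report

store = {
    "a1": ["shared"], "b1": ["shared", "gone"],
    "shared": [], "orphan": [],
}
projects = {"A": ["a1"], "B": ["b1"]}

quick = check_projects(store, projects)            # default, no unreachable scan
slow = check_projects(store, projects, full=True)  # infrequent housekeeping run
print(quick)
print(slow)
```

The default run never needs to know about objects outside the project it is walking, while the full run has to combine what every project reaches before it can call anything unreachable, which is why it is the expensive one.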
David Lang