On Sun, Mar 24, 2013 at 3:23 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Sun, Mar 24, 2013 at 08:01:33PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> On Sun, Mar 24, 2013 at 7:31 PM, Jeff King <peff@xxxxxxxx> wrote:
>> >
>> > I don't have details on the KDE corruption, or why it wasn't detected
>> > (if it was one of the cases I mentioned above, or a more subtle issue).
>>
>> One thing worth mentioning is this part of the article:
>>
>> "Originally, mirrored clones were in fact not used, but non-mirrored
>> clones on the anongits come with their own set of issues, and are more
>> prone to getting stopped up by legitimate, authenticated force pushes,
>> ref deletions, and so on – and if we set the refspec such that those
>> are allowed through silently, we don’t gain much. "
>>
>> So the only reason they were even using --mirror was because they were
>> running into those problems with fetching.

With a normal fetch. We actually *wanted* things like force updates and ref deletions to propagate, because we have not just Gitolite's checks but our own checks on the servers, and wanted that to be considered the authenticated source. Besides just daily use and preventing cruft, we wanted to ensure that such actions propagated so that if a branch was removed (for instance, because it contained personal information, accidental commits, or a security issue), the branch was removed on the anongits too, in a timely fashion.

> I think the --mirror thing is a red herring. It should not be changing
> the transport used, and that is the part of git that is expected to
> catch such corruption.
>
> But I haven't seen exactly what the corruption is, nor exactly what
> commands they used to clone. I've invited the blog author to give more
> details in this thread.

The syncing was performed via a clone with git clone --mirror (and a git:// URL) and updates with git remote update. So I should mention that my experiments after the fact were using local paths, but with --no-hardlinks.

If you're saying that the transport is where corruption is supposed to be caught, then it's possible that we shouldn't see corruption propagate on an initial mirror clone across git://, and that something else was responsible for the trouble we saw with the repositories that got cloned after the fact. But then I'd argue that this is non-obvious. In particular, when using --no-hardlinks, I wouldn't expect that behavior to be different with a straight path and with file://.

Something else: apparently one of my statements prompted joeyh to think about potential issues with backing up live git repos (http://joeyh.name/blog/entry/difficulties_in_backing_up_live_git_repositories/). Looking at that post made me realize that, when we were doing our initial thinking about the system three years ago, we made an assumption that, in fact, taking a .tar.gz of a repo as it's in the process of being written to or garbage collected or repacked could be problematic. This isn't a totally baseless assumption, as I once had a git repository suffer corruption when a sudden power outage hit while I was in the process of updating it. (It could totally have been the filesystem, of course, although it was a journaled file system.)

So, we decided to use Git's built-in capabilities of consistency checking to our advantage (with, as it turns out, a flaw in our implementation). But the question remains: are we wrong about thinking that rsyncing or taking a .tar.gz of live repositories in the middle of being pushed to/gc'd/repacked could result in a bogus backup?

Thanks,
Jeff
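
For reference, the sync flow described above (git clone --mirror once, then git remote update on later passes) could be sketched roughly as follows, with an explicit integrity check added after each pass. This is only an illustration under stated assumptions: using git fsck --full as the check is an assumption (the message does not say how the actual consistency check was implemented), and the function names, example URL, and paths are hypothetical.

```python
import subprocess
from typing import List


def mirror_sync_commands(url: str, dest: str, initial: bool) -> List[List[str]]:
    """Build the git invocations for one sync pass plus an integrity check.

    The clone/update commands are the ones named in the message. The fsck
    step is an assumed stand-in for "Git's built-in consistency checking".
    """
    if initial:
        # First pass: create the bare mirror.
        sync = ["git", "clone", "--mirror", url, dest]
    else:
        # Later passes: fetch updates into the existing mirror.
        sync = ["git", "--git-dir=" + dest, "remote", "update"]
    # Assumed check: verify object connectivity and validity in the mirror.
    check = ["git", "--git-dir=" + dest, "fsck", "--full"]
    return [sync, check]


def run_sync(url: str, dest: str, initial: bool) -> bool:
    """Run the sync pass; return False as soon as any step exits non-zero."""
    for argv in mirror_sync_commands(url, dest, initial):
        if subprocess.run(argv).returncode != 0:
            return False
    return True
```

Something like run_sync("git://example.org/repo.git", "/srv/mirror/repo.git", initial=True) would do the first clone, with initial=False on later passes. Running the check as a separate step means a bad mirror is flagged even in cases where the sync step itself reports success.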