On Wed, Feb 03 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> But I was wondering about fast-import.c in particular. I think Elijah's
>> patch here is obviously good an incremental improvement. But stepping
>> back a bit: who cares about sort-of-fsck validation in fast-import.c
>> anyway?
>
> Those who want to notice and verify the procedure they used to
> produce the import data from the original before it is too late?
>
> I.e. data gets imported to Git, victory declared and then old SCM
> turned gets off---and only then the resulting imported repository is
> found not to pass fsck.
>
>> Shouldn't it just pretty much be importing data as-is, and then we could
>> document "if you don't trust it, run fsck afterwards"?
>
> If it is a small import, the distinction does not matter, but for a
> huge import, the procedure to produce the data is likely to be
> mechanical, so even after processing just a very small portion of
> early part of the datastream, systematic errors would be noticed
> before fast-import wastes importing too much garbage that need to be
> discarded after running such fsck. So in that sense, I suspect that
> there is value in the early validation.

What I was fishing for here is that perhaps, since fast-import was
originally written, this use-case of in-place conversion of primary
data on a server might have become too obscure to care about, i.e. as
opposed to doing a conversion locally and then "git push"-ing it to
something that does transfer.fsckObjects.

>> Or, if it's a use-case people actually care about, then I might see
>> about unifying some of these parser functions as part of a series I'm
>> preparing.
>
> I think allowing people to loosen particular checks for fast-import
> (or elsewhere for that matter) is a good idea, and you can do so
> more easily once the existing checking is migrated to your new
> scheme that shares code with the fsck machinery.

...all right, depending on how much of a hassle that is, I might just
add tests for the differences and leave this particular problem to
someone else :)