On Wed, Jan 13, 2016 at 2:55 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Doug Kelly <dougk.ff7@xxxxxxxxx> writes:
>
>> Subject: Re: [PATCH 3/4] t5304: Ensure wanted files are not deleted
>
> I'd suggest s/wanted/non-garbage/.
>

I'm okay with this.

>> Explicitly test for and ensure files that may be wanted are not
>> deleted during a gc operation. These include .pack without .idx
>> (which may be in-flight), garbage in the directory, and .keep files
>> the user created.
>
> "garbage in the directory" is not well defined. "files in the
> directory that clearly are not related to packing" is probably what
> you meant, but the definition of "related to packing" is still
> fuzzy. Please clarify.

This is a good point. Perhaps a better approach would be to reword the
paragraph to something like this:

Explicitly test for and ensure files that may either be desired by the
user or are possibly not garbage are not deleted during a gc operation.
These include .pack files missing a corresponding .idx file (possibly
due to it being in-flight), .keep files created by the user, and other
unknown garbage in the pack directory. These files will still be
identified by "git count-objects -v", but should not be removed
automatically by gc. Only files we are absolutely sure are unnecessary
will be removed as a part of the gc process.

>
> The following is me thinking aloud about things that you would need
> to think about while attempting to clarify this.
>
> What should the code do if we find
>
>     pack-b0a9d62a7471e58832a575a78d57f8fb26822125.frotz
>
> in $GIT_OBJECT_DIRECTORY/pack/ directory? Is it a "garbage in the
> directory"? The filename looks so similar to the usual things with
> known suffixes .pack, .idx, .bitmap, and .keep, that we may want to
> consider that it might be another file related to the packing left
> by a future version of Git and if we do not see corresponding .pack
> we would want to remove it? Or do we want to do something else?
>
> What should "gc" do if we find
>
>     pack-frotz.idx
>
> without corresponding ".pack"? Wouldn't it be safer to consider it
> a garbage unrelated to packing (because regular packing would have
> given it with 40-hex name, not "frotz") and leave it undeleted?
>

I think the reworded paragraph above helps explain what we're doing and
why. In your first example, a file with a valid-looking pack name but an
unknown extension may be flagged as "garbage," but should not be deleted
during the gc. In contrast, we decided that an .idx file with no
corresponding .pack was safe to delete (since the pack is written before
the idx, and the initial performance problem was related to scanning a
large number of idx files).

I'm not dismissing the difference in the base filename (the part before
the extension). Currently, though, the logic to remove pack garbage
doesn't look at that: it only considers the extension and which related
files are found in the directory. Whether this is good or bad, I'm not
sure. It certainly does what I need at fairly low risk, though.

Does this help clarify the situation more?

> Thanks.
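
As a rough illustration of the extension-based policy described in the
reply above, here is a small shell sketch. It is not the C code in Git;
classify_pack_file and its ok/delete/keep labels are hypothetical names
made up for this example, and it models only the cases named in this
message (.idx, .pack, .keep, and unknown extensions). Like the logic
described above, it looks only at the extension and at sibling files,
never at the base filename.

```shell
#!/bin/sh
# Sketch only: models the policy described in this thread, not Git's C code.
# classify_pack_file is a hypothetical helper; it prints one of
#   ok      - the file has the companion file it needs
#   delete  - considered safe for gc to remove
#   keep    - may be reported as garbage, but gc leaves it alone
classify_pack_file () {
	path=$1
	case "$path" in
	*.idx)
		# The .pack is written before the .idx, so an .idx with no
		# .pack cannot belong to an in-flight pack.
		if test -f "${path%.idx}.pack"
		then
			echo ok
		else
			echo delete
		fi
		;;
	*.pack)
		# A .pack with no .idx may still be in flight.
		if test -f "${path%.pack}.idx"
		then
			echo ok
		else
			echo keep
		fi
		;;
	*.keep)
		# A .keep file may have been created by the user.
		if test -f "${path%.keep}.pack"
		then
			echo ok
		else
			echo keep
		fi
		;;
	*)
		# Unknown extension: conservatively kept in this sketch.
		echo keep
		;;
	esac
}
```

With the three files created in the quoted test (fake.pack, foo.keep,
fake2.foo), every call prints "keep"; only an .idx whose .pack is
missing prints "delete".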
>
>> Signed-off-by: Doug Kelly <dougk.ff7@xxxxxxxxx>
>> ---
>>  t/t5304-prune.sh | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
>> index 4fa6e7a..f7c380c 100755
>> --- a/t/t5304-prune.sh
>> +++ b/t/t5304-prune.sh
>> @@ -285,6 +285,23 @@ EOF
>>  	test_cmp expected actual
>>  '
>>
>> +test_expect_success 'ensure unknown garbage kept with gc' '
>> +	test_when_finished "rm -f .git/objects/pack/fake*" &&
>> +	test_when_finished "rm -f .git/objects/pack/foo*" &&
>> +	: >.git/objects/pack/foo.keep &&
>> +	: >.git/objects/pack/fake.pack &&
>> +	: >.git/objects/pack/fake2.foo &&
>> +	git gc &&
>> +	git count-objects -v 2>stderr &&
>> +	grep "^warning:" stderr | sort >actual &&
>> +	cat >expected <<\EOF &&
>> +warning: garbage found: .git/objects/pack/fake2.foo
>> +warning: no corresponding .idx or .pack: .git/objects/pack/foo.keep
>> +warning: no corresponding .idx: .git/objects/pack/fake.pack
>> +EOF
>> +	test_cmp expected actual
>> +'
>> +
>>  test_expect_success 'prune .git/shallow' '
>>  	SHA1=`echo hi|git commit-tree HEAD^{tree}` &&
>>  	echo $SHA1 >.git/shallow &&