On Wed, Feb 13, 2013 at 12:23 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> writes: > >> +/* A hook for count-objects to report invalid files in pack directory */ >> +extern void (*report_garbage)(const char *desc, const char *path, int len, const char *name); > > We may want to document the strange way the last three parameters > are used somewhere. e.g. > > shows "path" (if "name" is NULL), or prepends "path" in > front of name (otherwise); only for the latter, "path" can > be a string that is not NUL-terminated but its length > specified with "len" and in that case a slash is inserted > between the path and the "name". > > When described clearly, it sounds somewhat ugly and incoherent API, > even though it covers the immediate need X-<. One of the reasons why I did not export it explicitly. Changed it to void (*report_garbage)(const char *desc, const char *path); and pushed the ugly part back to callers. > How about doing it something along this line, perhaps? > > int i; > int beginning_of_this_name = -1; > int seen_bits = 0; /* 01 for .idx, 02 for .pack */ > for (i = 0; i < list->nr; i++) { > if (beginning_of_this_name < 0) > beginning_of_this_name = i; > else if (list->items[i] and list->items[beginning_of_this_name] > share the same basename) > ; /* keep scanning */ > else { > /* one name ended at (i-1) */ > if (seen_bits == 3) > ; /* both .idx and .pack exist; good */ > else > report_garbage_for_one_name(list, beginning_of_this_name, i, > seen_bits); > seen_bits = 0; > beginning_of_this_name = i; > } > if (list->items[i] is ".idx") > seen_bits |= 1; > if (list->items[i] is ".pack") > seen_bits |= 2; > > } > if (0 <= beginning_of_this_name && seen_bits != 3) > report_garbages_for_one_name(list, beginning_of_this_name, list->nr, seen_bits); > > with a helper function report_garbage_for_one_name() that would look like this: > > report_garbage_for_one_name(...) { > int j; > const char *msg; > switch (seen_bits) { > case 0: msg = "no corresponding .idx nor .pack"; break; > case 1: msg = "no corresponding .pack"; break; > case 2: msg = "no corresponding .idx; break; > } > for (j = beginning_of_this_name; j < i; j++) > report_garbage(msg, list->items[j]); > } > > For the above to work, prepare_packed_git_one() needs to retain only the > paths with known extensions in garbage list. "pack-deadbeef.unk" can and > should be reported as a garbage immediately when it is seen without being > placed in the list. Yup. Looks good. >> + } else if (has_extension(de->d_name, ".idx")) { >> + struct string_list_item *item; >> + int n = strlen(path) - 4; >> + item = string_list_append_nodup(&garbage, >> + xstrndup(path, n)); >> + item->util = ".idx"; >> + continue; >> + } else >> + report_garbage("garbage found", path, 0, NULL); > > Hmm, where is a ".keep" file handled in this flow? Apparently I smoked/drank while coding or something. .idx is supposed to be .keep. This calls for a test to guard my code (part of this v4). > The structure of the if/else cascade is much nicer than the earlier > iterations, but wouldn't it be even more clear to do this? > > if (is .idx file) { > ... do that .idx thing ... > } > > if (!report_garbage) > continue; /* it does not matter what the file is */ > > if (is .pack) { > ... remember that we saw this .pack ... > } else if (is .idx) { > ... remember that we saw this .idx ... > } else if (is .keep) { > ... remember that we saw this .keep ... > } else { > ... all else --- report as garbage immediately ... > } Done. 2/4 is updated to make sure the "if (is .idx file)" block does not shortcut the loop with "continue;" so that we always get .idx file in the end of the loop. Nguyễn Thái Ngọc Duy (4): git-count-objects.txt: describe each line in -v output sha1_file: reorder code in prepare_packed_git_one() count-objects: report garbage files in pack directory too count-objects: report how much disk space taken by garbage files Documentation/git-count-objects.txt | 22 ++++++-- builtin/count-objects.c | 30 ++++++++--- cache.h | 3 ++ sha1_file.c | 101 +++++++++++++++++++++++++++++++----- t/t5304-prune.sh | 26 ++++++++++ 5 files changed, 156 insertions(+), 26 deletions(-) -- 1.8.1.2.536.gf441e6d -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html