On Tue, Mar 23 2021, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > > This table is helpful for discovering data in the index to ensure it is > being written correctly, especially as we build and test the > sparse-index. This table includes an output format similar to 'git > ls-tree', but should not be compared to that directly. The biggest > reasons are that 'git ls-tree' includes a tree entry for every > subdirectory, even those that would not appear as a sparse directory in > a sparse-index. Further, 'git ls-tree' does not use a trailing directory > separator for its tree rows. > > This does not print the stat() information for the blobs. That will be > added in a future change with another option. The tests that are added > in the next few changes care only about the object types and IDs. > However, this future need for full index information justifies the need > for this test helper over extending a user-facing feature, such as 'git > ls-files'. Is that stat() information that's going to be essential to grab in the same process that runs the "for (i = 0; i < istate->cache_nr; i++)" for-loop, or stat() information that could be grabbed as: git ls-files -z --stage | some-program-that-stats-all-listed-blobs It's not so much that I still disagree as I feel like I'm missing something. I haven't gone through this topic with a fine toothed comb, so ... If and when these patches land and I'm using this nascent sparse checkout support why wouldn't I want ls-files or another not-a-test-tool to support extracting this new information that's in the index? That's why I sent the RFC patches at https://lore.kernel.org/git/20210317132814.30175-2-avarab@xxxxxxxxx/ to roll this functionality into ls-files. Still, I think if there's a good reason for why we want this in the index but never want our plumbing to be able to dump it in some user-facing way I think just as a matter of reviewing this code it would be much simpler if it was in ls-files behind some git_env_bool("GIT_TEST_...") flag or something. Or maybe I'm the only one who spends a lot of time with both ls-files.c and test-read-cache.c open while trying to review this trying to keep track of if and how this helper is and isn't subtly different from ls-files (as my RFC series shows, not really that different at all...). Especially with the really-just-ls-files-plus-one-thing tool mimicking ls-tree output, for reasons I still don't get... > To make the option parsing slightly more robust, wrap the string > comparisons in a loop adapted from test-dir-iterator.c. > > Care must be taken with the final check for the 'cnt' variable. We > continue the expectation that the numerical value is the final argument. I think even if you're set on not having this exposed in some builtin/*.c command this code would be much clearer based on some version of my https://lore.kernel.org/git/20210317132814.30175-6-avarab@xxxxxxxxx/ i.e. the part that isn't entirely deleting t/helper/test-read-cache.c, which would survive as t/helper/test-read-cache-sparse.c or something. As that patch shows this code is needlessly convoluted because it's serving 3x wildly different in-tree use-cases. I don't see how the very small amount of de-duplication we're getting is worth the complexity. At that point we don't need any care with the cnt variable, because we're not combining the fsmonitor and perf use-cases of reading the index in some loop with the ls-files-alike. > Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > --- > t/helper/test-read-cache.c | 55 +++++++++++++++++++++++++++++++------- > 1 file changed, 45 insertions(+), 10 deletions(-) > > diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c > index 244977a29bdf..6cfd8f2de71c 100644 > --- a/t/helper/test-read-cache.c > +++ b/t/helper/test-read-cache.c > @@ -1,36 +1,71 @@ > #include "test-tool.h" > #include "cache.h" > #include "config.h" > +#include "blob.h" > +#include "commit.h" > +#include "tree.h" > + > +static void print_cache_entry(struct cache_entry *ce) > +{ > + const char *type; > + printf("%06o ", ce->ce_mode & 0177777); > + > + if (S_ISSPARSEDIR(ce->ce_mode)) > + type = tree_type; > + else if (S_ISGITLINK(ce->ce_mode)) > + type = commit_type; > + else > + type = blob_type; > + > + printf("%s %s\t%s\n", > + type, > + oid_to_hex(&ce->oid), > + ce->name); > +} > + > +static void print_cache(struct index_state *istate) > +{ > + int i; > + for (i = 0; i < istate->cache_nr; i++) > + print_cache_entry(istate->cache[i]); > +} > > int cmd__read_cache(int argc, const char **argv) > { > + struct repository *r = the_repository; > int i, cnt = 1; > const char *name = NULL; > + int table = 0; > > - if (argc > 1 && skip_prefix(argv[1], "--print-and-refresh=", &name)) { > - argc--; > - argv++; > + for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) { > + if (skip_prefix(*argv, "--print-and-refresh=", &name)) > + continue; > + if (!strcmp(*argv, "--table")) > + table = 1; > } > > - if (argc == 2) > - cnt = strtol(argv[1], NULL, 0); > + if (argc == 1) > + cnt = strtol(argv[0], NULL, 0); > setup_git_directory(); > git_config(git_default_config, NULL); > + > for (i = 0; i < cnt; i++) { > - read_cache(); > + repo_read_index(r); > if (name) { > int pos; > > - refresh_index(&the_index, REFRESH_QUIET, > + refresh_index(r->index, REFRESH_QUIET, > NULL, NULL, NULL); > - pos = index_name_pos(&the_index, name, strlen(name)); > + pos = index_name_pos(r->index, name, strlen(name)); > if (pos < 0) > die("%s not in index", name); > printf("%s is%s up to date\n", name, > - ce_uptodate(the_index.cache[pos]) ? "" : " not"); > + ce_uptodate(r->index->cache[pos]) ? "" : " not"); > write_file(name, "%d\n", i); > } > - discard_cache(); > + if (table) > + print_cache(r->index); > + discard_index(r->index); > } > return 0; > }