Hi,

As you might be aware, I'm working on a mercurial remote helper for git. The way it stores metadata for mercurial manifests abuses "commit" references in trees, which are normally used for submodules. Some operations in the helper use git diff-tree on those trees to find files faster than running ls-tree on every commit would.

Anyway, long story short, it turns out that a combination of everything mentioned in the subject of this email causes running git diff-tree -r --stdin with a list of 300k+ pairs of commits to take 10 minutes, whereas (after investigation) adding --ignore-submodules=dirty makes it take 1 minute instead, for the exact same 3GB output.

This all starts in is_submodule_ignored(), which contains:

    if (!DIFF_OPT_TST(options, OVERRIDE_SUBMODULE_CONFIG))
        set_diffopt_flags_from_submodule_config(options, path);

And set_diffopt_flags_from_submodule_config calls:

    submodule_from_path(null_sha1, path);

Because there is no actual submodule involved, at some point that null_sha1 ends up in the call to read_sha1_file from submodule-config.c's config_from, which then proceeds to try to open the null sha1 as a loose object in every alternate, doing multiple system calls in each directory for something that is bound to fail.

To add insult to injury, it repeats that for each and every line of input to git diff-tree, because the object cache doesn't care about storing negatives (which makes perfect sense for most cases). Even worse, when read_object returns NULL because the object doesn't exist, read_sha1_file_extended calls has_loose_object, which does another set of system calls.

Now, while I realize my use case is very atypical, and that I should just use --ignore-submodules=dirty, the fact that using the null sha1 can trigger such behavior strikes me as a footgun that would be better avoided.
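To make the cost pattern above concrete, here is a minimal standalone model of it, together with the kind of early-out guard that would avoid it. This is not git code: every name in it (lookup_unguarded, lookup_guarded, is_null_sha1_model, probes_for, NR_ALTERNATES, probe_count) is made up for illustration, and the "probes" are just a counter standing in for system calls. The only thing carried over from git is the assumption that the null sha1 is 20 zero bytes and never names a real object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHA1_RAWSZ 20      /* raw SHA-1 length in bytes */
#define NR_ALTERNATES 3    /* hypothetical number of object directories */

/* Model of the null sha1: all zero bytes. */
const unsigned char null_sha1_model[SHA1_RAWSZ] = { 0 };
/* Some other, non-null id, used only for comparison. */
const unsigned char some_sha1_model[SHA1_RAWSZ] = { 0xab };

/* Counts simulated filesystem probes; stands in for real system calls. */
unsigned long probe_count;

typedef const void *(*lookup_fn)(const unsigned char *sha1);

/* Unguarded lookup: probe every alternate, never find anything. */
const void *lookup_unguarded(const unsigned char *sha1)
{
    int i;

    assert(sha1 != NULL);
    for (i = 0; i < NR_ALTERNATES; i++)
        probe_count++;     /* one failed probe per object directory */
    return NULL;           /* no object exists in this model */
}

int is_null_sha1_model(const unsigned char *sha1)
{
    return !memcmp(sha1, null_sha1_model, SHA1_RAWSZ);
}

/* Guarded lookup: the null sha1 cannot name an object, so return early. */
const void *lookup_guarded(const unsigned char *sha1)
{
    assert(sha1 != NULL);
    if (is_null_sha1_model(sha1))
        return NULL;
    return lookup_unguarded(sha1);
}

/* Run n lookups of sha1 (no negative cache) and return the probe total. */
unsigned long probes_for(lookup_fn lookup, const unsigned char *sha1,
                         unsigned long n)
{
    unsigned long i;

    probe_count = 0;
    for (i = 0; i < n; i++)
        lookup(sha1);
    return probe_count;
}
```

In this model, probes_for(lookup_unguarded, null_sha1_model, 300000) counts 900000 failed probes, while the guarded variant counts none for the same input. Lookups of any other id still go through the normal path.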
This is all the more worrying when you factor in the fact that read_sha1_file_extended calls lookup_replace_object_extended, which suggests one might interfere by creating a replace object for the null sha1.

(BTW, it's not entirely clear to me, in the context of actual submodules, what the various --ignore-submodules options are supposed to mean for trees that are not the current HEAD; also, the manual page says "all" is the default, but that doesn't appear to be true.)

From a cursory look at the output of `git grep \\bnull_sha1`, it doesn't look like the null sha1 is used anywhere else in a similar fashion where an attempt could be made to read it as an object. So, one could consider this something the submodule config code should handle on its own, by treating the null_sha1 argument to submodule_from_path (really config_from) specially. After all, gitmodule_sha1_from_commit already avoids a get_sha1() call when it's given the null sha1.

OTOH, it seems submodule_from_path and submodule_from_name, the only two public functions that end up in config_from(), are *always* called with either the null sha1 or a literal null pointer. The *only* calls to these functions that don't involve a null sha1 or a null pointer are from test code. So all in all, I'm not entirely sure what this sha1 argument is for in the first place.

However, an argument could be made that null_sha1 should be treated specially at a lower level (read_sha1_file, I guess).

What would be sensible to do here?

Mike