On Wed, Jun 29, 2022 at 12:50 PM Dian Xu <dianxudev@xxxxxxxxx> wrote:
>
> Dear Git developers,
>
> Reporting Issue:
>     'git add' hangs in a large repo which has
>     sparse-checkout file with large number of patterns in it
>
> Found in:
>     Git 2.34.3. Issue occurs after 'audit for interaction
>     with sparse-index' was introduced in add.c
>
> Reproduction steps:
>     1. Clone a repo which has e.g. 2 million plus files
>     2. Enable sparse checkout by: git config core.sparsecheckout true
>     3. Create a .git/info/sparse-checkout file with a large
>        number of patterns, e.g. 16k plus lines

Did you run `git read-tree -mu HEAD` or even `git sparse-checkout
reapply` after step 3 and before step 4?  If not, you've left the
working tree out-of-sync with the specified sparsity paths and should
fix that before running step 4.

>     4. Run 'git add', which will hang

Alternatively to the above, if you really want to add a file and
ignore the fact that it might be outside the sparsity patterns (and
risk it later randomly disappearing with checkout/rebase/merge/etc.
commands), then you can use `git add --sparse $FILENAME`.

> Investigations:
>     1. Stack trace:
>        add.c: cmd_add
>        -> add.c: prune_directory
>        -> pathspec.c: add_pathspec_matches_against_index
>        -> dir.c: path_in_sparse_checkout_1
>     2. In Git 2.33.3, the loop at pathspec.c line 42 runs
>        fast, even when istate->cache_nr is at 2 million
>     3. Since Git 2.34.3, the newly introduced 'audit for
>        interaction with sparse-index' (dir.c line 1459:
>        path_in_sparse_checkout_1) decides to loop through 2 million
>        files and match each one of them against the sparse-checkout
>        patterns
>     4. This hits the O(n^2) problem thus causes 'git add' to
>        hang (or ~1.5 hours to finish)
>
> Please help us take a look at this issue and let us know if you need
> more information.

I'm also curious if you can use --cone mode in sparse-checkout.
The O(N*M) behavior of sparse checkouts in non-cone mode is pretty fundamental, and we may need to add additional paths checking the sparsity patterns (i.e. more O(N*M) codepaths) to fix various user-observed bugs. Usage of --cone mode drops all of these to a linear cost.
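
To make the cost difference concrete, here is a rough Python sketch of
the two lookup strategies.  It is only a toy model, not Git's actual
code: the function names are made up for illustration, `fnmatchcase`
stands in for Git's real pattern matcher, and the cone-mode version
assumes the sparsity spec is just a set of recursively included
directories (it ignores the parent-directory patterns that real cone
mode also tracks).

```python
from fnmatch import fnmatchcase
import posixpath


def in_sparse_checkout_noncone(path, patterns):
    """Hypothetical non-cone check: test the path against every pattern.

    The last matching pattern wins, so all M patterns are visited.
    Returns (included, number_of_pattern_comparisons).
    """
    included = False
    comparisons = 0
    for pat in patterns:
        comparisons += 1
        negated = pat.startswith("!")
        glob = pat[1:] if negated else pat
        if fnmatchcase(path, glob):
            included = not negated
    return included, comparisons


def in_sparse_checkout_cone(path, recursive_dirs):
    """Hypothetical cone-style check: one set lookup per parent directory.

    The cost depends on the path's depth, not on the number of
    directories in the sparsity spec.
    Returns (included, number_of_set_lookups).
    """
    parent = posixpath.dirname(path)
    if not parent:
        # Files at the top level are always included in cone mode.
        return True, 0
    lookups = 0
    while parent:
        lookups += 1
        if parent in recursive_dirs:
            return True, lookups
        parent = posixpath.dirname(parent)
    return False, lookups


if __name__ == "__main__":
    # N = 1000 index entries, M = 50 sparsity entries (made-up numbers).
    paths = [f"dir{i % 100}/file{i}.c" for i in range(1000)]
    patterns = [f"dir{j}/*" for j in range(50)]
    cone_dirs = {f"dir{j}" for j in range(50)}

    noncone_work = sum(in_sparse_checkout_noncone(p, patterns)[1] for p in paths)
    cone_work = sum(in_sparse_checkout_cone(p, cone_dirs)[1] for p in paths)
    print("non-cone pattern comparisons:", noncone_work)  # N * M = 50000
    print("cone set lookups:", cone_work)                 # N = 1000
```

With 2 million index entries and 16k patterns, the first loop is on
the order of 3 * 10^10 pattern comparisons, which is in line with the
multi-hour runtime reported above.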