On Mon, Mar 8, 2021 at 7:07 AM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote: > > This large series goes on top of my 6 patch series for > read_tree_recursive() as this one further refactors that function. See > https://lore.kernel.org/git/20210308022138.28166-1-avarab@xxxxxxxxx/ > for that series. > > I noticed that since 2014 or so we haven't been doing the fsck checks > for bad file modes in trees. This series fixes that. I plan to add > tests etc. for that in another follow-up series. > > I wanted to get this out for review sooner than later, particularly > since the fsck testing will probably get me down another refactoring > path (fsck testing in general in this area is pretty bad...). > > As noted in 30/30 it would have been way easier to simply do an > isolated fix for that bug by introducing some fsck-specific API for > raw tree reading. > > But I thought the bug was symptomatic of a wider problem in our > codebase. Namely that we pass around the tree's mode *a lot*. > > But almost everything that then deals with the mode doesn't per-se > care about the mode bits in the tree, but using them to map that mode > to a tree entry for one of of OBJ_{BLOB,TREE,COMMIT}. > > So this is a large refactoring of all users of the widely used > tree-walk.h API to "enum obj2ect_type", finally in 29/30 I rename the > field to a scary "raw_mode". > > At that point we have just ~30-50 grep hits left for "raw_mode" in the > codebase (depending on whether we count names in function parameters). > > Hopefully being in that state alleviates e.g. Elijah's concerns > expressed in > https://lore.kernel.org/git/CABPp-BEdu1PqV5W=FuL0f08iFhGzvzV8oSUybNj4eF0aAwTnAw@xxxxxxxxxxxxxx/ > I agree that doing the equivalent of 30/30 on top of master would be > way too scary, but once we're at 29/30 I think it's sane. > > I tested this in combination with his on-list series to add more > merge-ort testing: > https://lore.kernel.org/git/pull.973.git.git.1614905738.gitgitgadget@xxxxxxxxx/ > > I found a regression I'd caused in the merge-ort.c code with those > tests, fixed here. See the comment in merge-ort.c in 30/30. I'll start reading through this series, but two quick early notes: - I was worried about touching the tree-walk code message up performance. Since collect_rename_info() in merge-ort.c takes the most time (and it's mostly a tree-walk), I was worried this would regress performance for me. A couple runs of my mega-renames big rebase testcase suggests performance is not that different, so my fears look unfounded. (Note: I tested in combination with all my performance improvements, because otherwise tree-walking would just be in the noise of overall runtime.) - There's a textual conflict with this series and my sets of patch series that finish off the ort implementation[1], but it's just two lines that are pretty easy to resolve. [1] https://github.com/gitgitgadget/git/pulls?q=is%3Aopen+is%3Apr+author%3Anewren+Optimization+batch > Ævar Arnfjörð Bjarmason (30): > diff.c: remove redundant canon_mode() call > notes & match-trees: use name_entry's "pathlen" member > cache.h: add a comment to object_type() > tree-walk.h: add object_type member to name_entry > tree-walk.c: migrate to using new "object_type" field when possible > cache.h: have base_name_compare() take "is tree?", not "mode" > tree-walk.h users: switch object_type(...) to new .object_type > tree.h: format argument lists of read_tree_recursive() users > tree.h users: format argument lists in archive.c > archive: get rid of 'stage' parameter > tree.h API: make read_tree_fn_t take an "enum object_type" > tree-walk.h users: migrate "p->mode &&" pattern > tree-walk.h users: refactor chained "mode" if/else into switch > tree-walk.h users: migrate miscellaneous "mode" to "object_type" > merge-tree tests: test for the mode comparison in same_entry() > merge-ort: correct reference to test in 62fdec17a11 > fsck.c: switch on "object_type" in fsck_walk_tree() > tree-walk.h users: use temporary variable(s) for "mode" > tree-walk.h API: formatting changes for subsequent commit > tree-walk.h API: rename get_tree_entry() to get_tree_entry_mode() > tree-walk.h API users: use "tmp" for mode in shift_tree_by() > tree-walk.h API: Add get_tree_entry_type() > tree-walk.h API: add a get_tree_entry_path() function > tree-walk.h API: document and format tree_entry_extract() > tree-entry.h API: rename tree_entry_extract() to > tree_entry_extract_mode() > tree-walk.h API: add a tree_entry_extract_all() function > tree-walk.h API: add a tree_entry_extract_type() function > tree-walk.h API users: rename "struct name_entry"'s "mode" to > "raw_mode" > tree.h API users: rename read_tree_fn_t's "mode" to "raw_mode" > tree-walk.h API: move canon_mode() back out of decode_tree_entry() > > archive.c | 51 +++++++++++++----------- > blame.c | 9 +++-- > builtin/checkout.c | 7 +++- > builtin/fast-import.c | 8 ++-- > builtin/grep.c | 6 +-- > builtin/log.c | 7 ++-- > builtin/ls-files.c | 13 +++--- > builtin/ls-tree.c | 18 ++++----- > builtin/merge-tree.c | 32 +++++++++------ > builtin/mktree.c | 4 +- > builtin/pack-objects.c | 6 +-- > builtin/reflog.c | 3 +- > builtin/rm.c | 2 +- > builtin/update-index.c | 7 +++- > cache-tree.c | 2 +- > cache.h | 11 ++++-- > combine-diff.c | 8 ++-- > delta-islands.c | 2 +- > diff.c | 2 +- > fsck.c | 23 +++++------ > http-push.c | 6 ++- > line-log.c | 2 +- > list-objects.c | 20 +++++++--- > match-trees.c | 52 ++++++++++++------------ > merge-ort.c | 34 ++++++++++------ > merge-recursive.c | 33 ++++++++-------- > notes.c | 15 +++---- > object-name.c | 7 ++-- > pack-bitmap-write.c | 8 ++-- > read-cache.c | 16 ++++---- > revision.c | 12 ++++-- > t/t4300-merge-tree.sh | 44 +++++++++++++++++++++ > tree-diff.c | 44 ++++++++++++--------- > tree-walk.c | 89 +++++++++++++++++++++++++++++++----------- > tree-walk.h | 67 ++++++++++++++++++++++++++----- > tree.c | 19 +++++---- > tree.h | 5 ++- > unpack-trees.c | 30 ++++++++------ > walker.c | 22 ++++++----- > 39 files changed, 482 insertions(+), 264 deletions(-) > > -- > 2.31.0.rc0.126.g04f22c5b82 >