"Victoria Dye via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Victoria Dye <vdye@xxxxxxxxxx> > > Allow users to specify a single "tree-ish" value as a positional argument. > If provided, the contents of the given tree serve as the basis for the new > tree (or trees, in --batch mode) created by 'mktree', on top of which all of > the stdin-provided tree entries are applied. > > At a high level, the entries are "applied" to a base tree by iterating > through the base tree using 'read_tree' in parallel with iterating through > the sorted & deduplicated stdin entries via their iterator. That is, for > each call to the 'build_index_from_tree callback of 'read_tree': > > * If the iterator entry precedes the base tree entry, add it to the in-core > index, increment the iterator, and repeat. "add it" -> "add the base tree entry"? The next bullet point explicitly says it adds "the iterator entry", which makes it crystal clear what is going on. > * If the iterator entry has the same name as the base tree entry, add the > iterator entry to the index, increment the iterator, and return from the > callback to continue the 'read_tree' iteration. > * If the iterator entry follows the base tree entry, first check > 'df_name_hash' to ensure we won't be adding an entry with the same name > later (with a different mode). If there's no directory/file conflict, add > the base tree entry to the index. In either case, return from the callback > to continue the 'read_tree' iteration. IOW, we take advantage of the fact that iteration over the base tree and iteration over the sorted-and-deduped entries from the standard input are already sorted, and do a simple bog-standard "merge" of two lists? We'd probably have many common pitfalls to avoid with the read-tree walking the index and tree(s) in parallel (I still remember the pain of maintaining the cache_bottom for the side that walks the index). 
Makes me wonder if this opens a way to a future where somehow read-tree also shares code with this new code in mktree (or vice versa). > Finally, once 'read_tree' is complete, add the remaining entries in the > iterator to the index and write out the index as a tree. Or vice versa? We may finish iterating over the entries read from the standard input but there still are entries from the base tree side remaining, which would need to be added to complete the index, right? > +<tree-ish>:: > + If provided, the tree entries provided in stdin are added to this > + tree rather than a new empty one, replacing existing entries with > + identical names. Not compatible with `--literally`. "replacing" might need a bit more clarification when we start reading paths with multiple pathname components concatenated with slashes. In the base tree, we may have 100644 blob 536e55524db72bd2acf175208aef4f3dfc148d42 D and it can (indirectly) replaced by the standard input stream feeding entries like these 100644 blob b0517166ae2ad92f3b17638cbdee0f04b8170d99 D/a 100644 blob 495a54bc1397e2fd3177c2733baf4899b48d30bd D/b which also leads us to compute a tree entry 040000 tree eccdce44520aa3ef4ac5ba090df53eadb01229ef D/ in the top-level tree? The code looks good to me. Thanks.
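
P.S. To illustrate the "merge of two sorted lists" reading above, here is
a rough sketch in Python. It is not the patch's C code. The function name
merge_entries and the (name, mode, oid) tuple layout are made up for the
illustration. It assumes flat names compared as plain strings, so it
ignores both the directory/file conflict check and the way Git sorts
directory entries as if their names ended in a slash.

```python
def merge_entries(base, updates):
    """Merge two name-sorted lists of (name, mode, oid) tuples.

    An entry in `updates` replaces the entry in `base` that has the
    same name. Both inputs must be sorted by name, and `updates`
    must not contain duplicate names.
    """
    merged = []
    i = j = 0
    while i < len(base) and j < len(updates):
        base_name = base[i][0]
        update_name = updates[j][0]
        if update_name < base_name:
            # The update sorts first: emit it, keep the base entry pending.
            merged.append(updates[j])
            j += 1
        elif update_name == base_name:
            # Same name: the update wins and the base entry is dropped.
            merged.append(updates[j])
            i += 1
            j += 1
        else:
            # The base entry sorts first: carry it over unchanged.
            merged.append(base[i])
            i += 1
    # Either side may have leftovers; at most one of these is non-empty.
    merged.extend(base[i:])
    merged.extend(updates[j:])
    return merged


base = [("a", "100644", "oid-a"), ("c", "100644", "oid-c"), ("e", "100644", "oid-e")]
updates = [("b", "100644", "oid-b"), ("c", "100755", "oid-c2")]
result = merge_entries(base, updates)
```

In this example the update for "c" replaces the base entry, "b" is
inserted between "a" and "c", and "e" is a leftover from the base side
that still has to be emitted after the updates run out.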