I'm not very familiar with the merge machinery, but I'll attempt a review of at least the first 10 patches. > merged_info contains all relevant information for a non-conflicted > entry. conflict_info contains a merged_info, plus any additional > information about a conflict such as the higher orders stages involved > and the names of the paths those came from (handy once renames get > involved). If an entry remains conflicted, the merged_info portion of a > conflict_info will later be filled with whatever version of the file > should be placed in the working directory (e.g. an as-merged-as-possible > variation that contains conflict markers). I think that this information should be in the .c file. > diff --git a/merge-ort.c b/merge-ort.c > index b487901d3e..9d5ea0930d 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -17,6 +17,46 @@ > #include "cache.h" > #include "merge-ort.h" > > +#include "strmap.h" > + > +struct merge_options_internal { > + struct strmap paths; /* maps path -> (merged|conflict)_info */ > + struct strmap unmerged; /* maps path -> conflict_info */ > + const char *current_dir_name; > + int call_depth; > +}; Maybe document if the path is from the root of the directory or just the filename as it appears in a tree object? I would have expected "unmerged" to be a "strset", but I guess it's more convenient for it to be a map itself. Maybe just document it as "all mappings in paths wherein the value is a struct conflict_info". There seems to be 2 ways of referring to something that we couldn't merge - "conflicted" (or "having a conflict") and "unmerged". Should we stick to one of them? Also, looking ahead, I see that current_dir_name is used as a temporary variable in the recursive calls to collect_merge_info_callback(). I would prefer if current_dir_name went in the cbdata to that function instead, but if that's not possible, maybe document here that current_dir_name is only used in collect_merge_info_callback(), and temporarily at that. > +struct version_info { > + struct object_id oid; > + unsigned short mode; > +}; OK. > +struct merged_info { > + struct version_info result; > + unsigned is_null:1; > + unsigned clean:1; > + size_t basename_offset; > + /* > + * Containing directory name. Note that we assume directory_name is > + * constructed such that > + * strcmp(dir1_name, dir2_name) == 0 iff dir1_name == dir2_name, > + * i.e. string equality is equivalent to pointer equality. For this > + * to hold, we have to be careful setting directory_name. > + */ > + const char *directory_name; > +}; I'm not sure how most of the fields in this struct are to be used, but perhaps that will be clearer once I read the subsequent code. > +struct conflict_info { > + struct merged_info merged; > + struct version_info stages[3]; > + const char *pathnames[3]; Why would these be different across stages? (Rename detection?) > + unsigned df_conflict:1; OK. > + unsigned path_conflict:1; This doesn't seem to be assigned anywhere in this patch set? > + unsigned filemask:3; > + unsigned dirmask:3; I wonder if this needs to be documented that the least significant bit corresponds to stages[0], and so forth. > + unsigned match_mask:3; I think this can be derived by just looking at the stages array? Maybe document as: Optimization to track which stages match. Either 0 or at least 2 bits are set; if at least 2 bits are set, their corresponding stages match. > +};