On Wed, Nov 4, 2020 at 9:34 PM Matheus Tavares <matheus.bernardino@xxxxxx> wrote:
>
> This new interface allows us to enqueue some of the entries being
> checked out to later call write_entry() for them in parallel. For now,
> the parallel checkout machinery is enabled by default and there is no
> user configuration, but run_parallel_checkout() just writes the queued
> entries in sequence (without spawning additional workers). The next
> patch will actually implement the parallelism and, later, we will make
> it configurable.

I would think that it might be more logical to first add a
configuration that does nothing, then add writing the queued entries
in sequence without parallelism, and then add actual parallelism.

> When there are path collisions among the entries being written (which
> can happen e.g. with case-sensitive files in case-insensitive file
> systems), the parallel checkout code detects the problem and marks the
> item with PC_ITEM_COLLIDED.

Is this needed in this step, which only writes the queued entries in
sequence without parallelism, or could it be added later, before the
step that adds actual parallelism?

> Later, these items are sequentially fed to
> checkout_entry() again. This is similar to the way the sequential code
> deals with collisions, overwriting the previously checked out entries
> with the subsequent ones. The only difference is that, when we start
> writing the entries in parallel, we won't be able to determine which of
> the colliding entries will survive on disk (for the sequential
> algorithm, it is always the last one).

So I guess that PC_ITEM_COLLIDED will then be used to decide which
entries will not be checked out in parallel?

> I also experimented with the idea of not overwriting colliding entries,
> and it seemed to work well in my simple tests.

There are a number of co-authors of this patch, so it's not very
clear who "I" is. Maybe: "The idea of not overwriting colliding
entries seemed to work well in simple tests, however ..."

> However, because just one
> entry of each colliding group would be actually written, the others
> would have null lstat() fields on the index. This might not be a problem
> by itself, but it could cause performance penalties for subsequent
> commands that need to refresh the index: when the st_size value cached
> is 0, read-cache.c:ie_modified() will go to the filesystem to see if the
> contents match. As mentioned in the function:
>
>  * Immediately after read-tree or update-index --cacheinfo,
>  * the length field is zero, as we have never even read the
>  * lstat(2) information once, and we cannot trust DATA_CHANGED
>  * returned by ie_match_stat() which in turn was returned by
>  * ce_match_stat_basic() to signal that the filesize of the
>  * blob changed. We have to actually go to the filesystem to
>  * see if the contents match, and if so, should answer "unchanged".
>
> So, if we have N entries in a colliding group and we decide to write and
> lstat() only one of them, every subsequent git-status will have to read,
> convert, and hash the written file N - 1 times, to check that the N - 1
> unwritten entries are dirty. By checking out all colliding entries (like
> the sequential code does), we only pay the overhead once.

Ok.

>  5 files changed, 410 insertions(+), 3 deletions(-)

It looks like a lot of new code in one patch/commit, which is why it
might be interesting to split it.
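By the way, to check my understanding of the whole flow, here is a
rough sketch of how I expect callers to drive this interface. The
surrounding loop and the exact signature of run_parallel_checkout()
are only illustrative, not copied from the patch:

init_parallel_checkout(); /* PC_UNINITIALIZED -> PC_ACCEPTING_ENTRIES */

for (i = 0; i < state->istate->cache_nr; i++) {
	struct cache_entry *ce = state->istate->cache[i];
	/*
	 * checkout_entry() may now call enqueue_checkout() and
	 * delay write_entry() for the eligible entries.
	 */
	checkout_entry(ce, state, NULL, NULL);
}

/*
 * Writes the queued entries (sequentially, for now) and then feeds
 * the PC_ITEM_COLLIDED ones to checkout_entry() again.
 */
run_parallel_checkout(state);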
> @@ -7,6 +7,7 @@
>  #include "progress.h"
>  #include "fsmonitor.h"
>  #include "entry.h"
> +#include "parallel-checkout.h"
>
>  static void create_directories(const char *path, int path_len,
>                                 const struct checkout *state)
> @@ -426,8 +427,17 @@ static void mark_colliding_entries(const struct checkout *state,
>         for (i = 0; i < state->istate->cache_nr; i++) {
>                 struct cache_entry *dup = state->istate->cache[i];
>
> -               if (dup == ce)
> -                       break;
> +               if (dup == ce) {
> +                       /*
> +                        * Parallel checkout creates the files in no particular
> +                        * order. So the other side of the collision may appear
> +                        * after the given cache_entry in the array.
> +                        */

Is it really the case right now that the code creates files in no
particular order, or will that be the case later, when actual
parallelism is implemented?

> +                       if (parallel_checkout_status() == PC_RUNNING)
> +                               continue;
> +                       else
> +                               break;
> +               }

> +struct parallel_checkout_item {
> +       /* pointer to a istate->cache[] entry. Not owned by us. */
> +       struct cache_entry *ce;
> +       struct conv_attrs ca;
> +       struct stat st;
> +       enum pc_item_status status;
> +};

"item" seems not very clear to me. If there is only one
parallel_checkout_item for each cache_entry, then it might be better
to use "parallel_checkout_entry" instead of "parallel_checkout_item".

> +enum pc_status {
> +       PC_UNINITIALIZED = 0,
> +       PC_ACCEPTING_ENTRIES,
> +       PC_RUNNING,
> +};
> +
> +enum pc_status parallel_checkout_status(void);
> +void init_parallel_checkout(void);

Maybe a comment telling what the above function does would be
helpful. If I had to guess, I would write something like:

/*
 * Put parallel checkout into the PC_ACCEPTING_ENTRIES state.
 * Should be used only when in the PC_UNINITIALIZED state.
 */

> +/*
> + * Return -1 if parallel checkout is currently not enabled

Is it "enabled" or "initialized" or "configured" here? Does it refer
to `enum pc_status` or a config option or something else? Looking at
the code, it is testing whether the status is PC_ACCEPTING_ENTRIES,
so perhaps: s/not enabled/not accepting entries/

> or if the entry is
> + * not eligible for parallel checkout. Otherwise, enqueue the entry for later
> + * write and return 0.
> + */
> +int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca);
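To illustrate why I find "not accepting entries" clearer: with that
wording, the comment would match the guard at the top of
enqueue_checkout() one-to-one. A rough sketch of what I mean, where
is_eligible_for_parallel_checkout() is just a placeholder name for
whatever eligibility check the patch actually uses:

int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca)
{
	/*
	 * Not accepting entries: parallel checkout is either
	 * uninitialized or already running.
	 */
	if (parallel_checkout_status() != PC_ACCEPTING_ENTRIES ||
	    !is_eligible_for_parallel_checkout(ce, ca))
		return -1;

	/* ... append the entry to the queue and return 0 ... */
	return 0;
}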