On Thu, Jan 09, 2025 at 03:32:36AM -0500, Jeff King wrote: > The combine_diff_path struct has variable size, since it embeds both the > memory allocation for the path field as well as a variable-sized parent > array. This makes allocating one a bit tricky. > > We have a helper to compute the required size, but it's up to individual > sites to actually initialize all of the fields. Let's provide a > constructor function to make that a little nicer. Besides being shorter, > it also hides away tricky bits like the computation of the "path" > pointer (which is right after the "parent" flex array). > > As a bonus, using the same constructor everywhere means that we'll > consistently initialize all parts of the struct. A few code paths left > the parent array unitialized. This didn't cause any bugs, but we'll be > able to simplify some code in the next few patches knowing that the > parent fields have all been zero'd. > > This also gets rid of some questionable uses of "int" to store buffer > lengths. Though we do use them to allocate, I don't think there are any > integer overflow vulnerabilities here (the allocation helper promotes > them to size_t and checks arithmetic for overflow, and the actual memcpy > of the bytes is done using the possibly-truncated "int" value). > > Sadly we can't use the FLEX_* macros to simplify the allocation here, > because there are two variable-sized parts to the struct (and those > macros only handle one). > > Nor can we get stop publicly declaring combine_diff_path_size(). This s/we get stop/we stop/ > diff --git a/combine-diff.c b/combine-diff.c > index 641bc92dbd..45548fd438 100644 > --- a/combine-diff.c > +++ b/combine-diff.c > @@ -47,22 +47,13 @@ static struct combine_diff_path *intersect_paths( > > if (!n) { > for (i = 0; i < q->nr; i++) { > - int len; > - const char *path; > if (diff_unmodified_pair(q->queue[i])) > continue; > - path = q->queue[i]->two->path; > - len = strlen(path); > - p = xmalloc(combine_diff_path_size(num_parent, len)); > - p->path = (char *) &(p->parent[num_parent]); > - memcpy(p->path, path, len); > - p->path[len] = 0; > - p->next = NULL; > - memset(p->parent, 0, > - sizeof(p->parent[0]) * num_parent); > - > - oidcpy(&p->oid, &q->queue[i]->two->oid); > - p->mode = q->queue[i]->two->mode; > + p = combine_diff_path_new(q->queue[i]->two->path, > + strlen(q->queue[i]->two->path), > + q->queue[i]->two->mode, > + &q->queue[i]->two->oid, > + num_parent); > oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid); > p->parent[n].mode = q->queue[i]->one->mode; > p->parent[n].status = q->queue[i]->status; > @@ -1667,3 +1658,24 @@ void diff_tree_combined_merge(const struct commit *commit, > diff_tree_combined(&commit->object.oid, &parents, rev); > oid_array_clear(&parents); > } > + > +struct combine_diff_path *combine_diff_path_new(const char *path, > + size_t path_len, > + unsigned int mode, > + const struct object_id *oid, > + size_t num_parents) > +{ > + struct combine_diff_path *p; > + > + p = xmalloc(combine_diff_path_size(num_parents, path_len)); > + p->path = (char *)&(p->parent[num_parents]); > + memcpy(p->path, path, path_len); > + p->path[path_len] = 0; > + p->next = NULL; > + p->mode = mode; > + oidcpy(&p->oid, oid); > + > + memset(p->parent, 0, sizeof(p->parent[0]) * num_parents); > + > + return p; > +} If I were to write this anew I'd probably use `xcalloc()` instead of manually `memset()`ing parts of it to zero. But it's a faithful transplant of the code from `intersect_paths()`, so that's probably okay. Patrick