On Tue, Sep 13, 2016 at 4:32 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Stefan Beller <sbeller@xxxxxxxxxx> writes:
>
>> So would we rather want to keep the ecbdata around for each file pair and
>> just reference that? I thought we deliberately want to avoid ecbdata, so maybe
>> we rather want to have another struct that keeps path related information
>> around (pointer to the blob and white space information).
>
> I would expect that there would be two structs, one per path
> "struct buffered_patch" that has the per-path thing, and another per
> line "struct buffered_patch_line" that describes what each line is,
> and has a pointer to the former.
>

Heh, I was trying to come up with a clever way to avoid storing that
pointer, as we would need to have it once per line, so in large patches
that would save a bit of space. But I probably should not try to be too
smart about it.

So I'd split up the struct line_emission into the two proposed structs,
buffered_patch_line and buffered_patch.

However, the naming is a bit off from what I would expect. Historically
you had one patch per file, so it was natural to call a change to
multiple files a "patchset" (cf. a commit in Gerrit, which is called a
"patchset"/revision). Today, as Git is quite successful, one "patch" is
easily understood as the equivalent of one commit, i.e. what
format-patch produces.

So I'd prefer to go with buffer_filepair and buffer_line maybe?
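
To make the split concrete, here is a rough sketch of what I have in
mind. Only the two struct names come from the discussion above; the
fields and both helpers are made up for illustration and are not what
diff.c has today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-file-pair state, stored once and shared by all lines. */
struct buffer_filepair {
	const char *path;	/* path of the file pair (illustrative) */
	unsigned ws_rule;	/* whitespace rule for this path (illustrative) */
};

/* Hypothetical per-line record; it only points back to its file pair. */
struct buffer_line {
	const struct buffer_filepair *pair;
	const char *text;	/* line content, not owned by this struct */
	size_t len;
	char sign;		/* '+', '-' or ' ' */
};

/* Fill in one buffered line; the file pair must already exist. */
static void buffer_line_init(struct buffer_line *line,
			     const struct buffer_filepair *pair,
			     char sign, const char *text)
{
	assert(pair != NULL);
	line->pair = pair;
	line->text = text;
	line->len = strlen(text);
	line->sign = sign;
}

/* Count the buffered lines that belong to the given file pair. */
static size_t count_lines_for_pair(const struct buffer_line *lines,
				   size_t nr,
				   const struct buffer_filepair *pair)
{
	size_t i, count = 0;

	for (i = 0; i < nr; i++)
		if (lines[i].pair == pair)
			count++;
	return count;
}
```

The cost is the one pointer per line mentioned above; in exchange every
line can reach its path and whitespace information without a separate
lookup.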