"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > if (nr == sizeof(f->buffer)) { > - /* process full buffer directly without copy */ > - data = buf; > + /* > + * Flush a full batch worth of data directly > + * from the input, skipping the memcpy() to > + * the hashfile's buffer. In this block, > + * f->offset is necessarily zero. > + */ What made me a bit confused was the fact that, in order to exercise the "bypass memcpy and take a full bufferful from the incoming data directly" optimization, there are two preconditions. The incoming data must be large enough, and we do not have anything kept in the buffer that needs to be emitted before the incoming data. And the cleverness of the original code was that both are checked by this single "nr == sizeof(f->buffer)" condition. So I do appreciate this extra comment, and I think future readers of the code will, too. > + the_hash_algo->update_fn(&f->ctx, buf, nr); > + flush(f, buf, nr); > } else { > - memcpy(f->buffer + offset, buf, nr); > - data = f->buffer; > + /* > + * Copy to the hashfile's buffer, flushing only > + * if it became full. > + */ > + memcpy(f->buffer + f->offset, buf, nr); > + f->offset += nr; > + left -= nr; > + if (!left) > + hashflush(f); > } > > count -= nr; > - offset += nr; > buf = (char *) buf + nr; > - left -= nr; > - if (!left) { > - the_hash_algo->update_fn(&f->ctx, data, offset); > - flush(f, data, offset); > - offset = 0; > - } > - f->offset = offset; > } > } > > > base-commit: 142430338477d9d1bb25be66267225fb58498d92