Derrick Stolee <stolee@xxxxxxxxx> writes: > But...is there something we could still do here? > > My confusion about flushing is mostly due to my error, but upon > reflection the loop is doing a lot of different things, but most of > the time we know which behavior we need at the start, in the middle, > and at the end: > > 1. Fill the existing buffer with the beginning of 'buf'. If the > hashfile's buffer is full, then flush. "But do not do this if f->buffer is empty, and we are writing out more than sizeof(f->buffer)." is missing, isn't it? > 2. Flush sizeof(f->buffer) chunks directly out of 'buf' as long as > possible. > > 3. Copy the remaining bytes out of 'buf' into the hashfile's buffer. It is debatable if the existing loop, which came mostly from Nico's a8032d12 (sha1write: don't copy full sized buffers, 2008-09-02), is too clever; I personally find it concise and readable enough, but my reading is tainted. If you use a couple of helpers to reduce the repeated "crc and hash" pattern in your variant, it may become easier to follow than the original, but I dunno. > Here is a rewrite that more explicitly follows this flow: > > void hashwrite(struct hashfile *f, const void *buf, unsigned int count) > { > const int full_buffer = sizeof(f->buffer); > unsigned left = full_buffer - f->offset; > unsigned nr = count > left ? left : count; > > /* > * Initially fill the buffer in a batch until it > * is full, then flush. > */ > if (f->do_crc) > f->crc32 = crc32(f->crc32, buf, nr); > > memcpy(f->buffer + f->offset, buf, nr); Here, if the f->buffer was empty, we end up memcpy a full bufferful unconditionally. Nico's original cleverly takes advantage of the fact that 'nr' would be the full buffer size only when the f->buffer was empty upon entry to the function and we have more byte than the size of the buffer to copy out directly from 'buf'. > f->offset += nr; > count -= nr; > buf = (char *) buf + nr; > > if (left == nr) > hashflush(f); > > /* > * After filling the hashfile's buffer and flushing, take > * batches of full_buffer bytes directly from the input > * buffer. > */ > while (count >= full_buffer) { > if (f->do_crc) > f->crc32 = crc32(f->crc32, buf, full_buffer); > > the_hash_algo->update_fn(&f->ctx, buf, full_buffer); > flush(f, buf, full_buffer); > > count -= full_buffer; > buf = (char *) buf + full_buffer; > } > > /* > * Capture any remaining bytes at the end of the input buffer > * into the hashfile's buffer. We do not need to flush because > * count is strictly less than full_buffer here. > */ > if (count) { > if (f->do_crc) > f->crc32 = crc32(f->crc32, buf, count); > > memcpy(f->buffer + f->offset, buf, count); > f->offset = count; > } > > if (f->base) > hashwrite(f->base, buf, count); > } > ... > So, I'm of two minds here: > > 1. This is embarassing. I wasted everyone's time for nothing. I can retract > this patch. > > 2. This is embarassing. I overstated the problem here. But we might be able > to eke out a tiny performance boost here. > > I'm open to either. I think we should default to dropping this patch unless > someone thinks the rewrite above is a better organization of the logic. (I > can then send a v2 including that version and an updated commit message.) 3. The current code around "if (nr == sizeof(f->buffer))" might be a bit too clever for readers who try to understand what is going on, and the whole "while" loop may deserve a comment based on what you wrote before your replacement implementation.