On Thu, Oct 19, 2023 at 01:28:51PM -0400, Taylor Blau wrote: > Continue to prepare for streaming an object's contents directly from > memory by teaching `bulk_checkin_source` how to perform reads and seeks > based on an address in memory. > > Unlike file descriptors, which manage their own offset internally, we > have to keep track of how many bytes we've read out of the buffer, and > make sure we don't read past the end of the buffer. > > Suggested-by: Junio C Hamano <gitster@xxxxxxxxx> > Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx> > --- > bulk-checkin.c | 18 +++++++++++++++++- > 1 file changed, 17 insertions(+), 1 deletion(-) > > diff --git a/bulk-checkin.c b/bulk-checkin.c > index 28bc8d5ab4..60361b3e2e 100644 > --- a/bulk-checkin.c > +++ b/bulk-checkin.c > @@ -141,11 +141,15 @@ static int already_written(struct bulk_checkin_packfile *state, struct object_id > } > > struct bulk_checkin_source { > - enum { SOURCE_FILE } type; > + enum { SOURCE_FILE, SOURCE_INCORE } type; > > /* SOURCE_FILE fields */ > int fd; > > + /* SOURCE_INCORE fields */ > + const void *buf; > + size_t read; > + > /* common fields */ > size_t size; > const char *path; > @@ -157,6 +161,11 @@ static off_t bulk_checkin_source_seek_to(struct bulk_checkin_source *source, > switch (source->type) { > case SOURCE_FILE: > return lseek(source->fd, offset, SEEK_SET); > + case SOURCE_INCORE: > + if (!(0 <= offset && offset < source->size)) > + return (off_t)-1; > + source->read = offset; > + return source->read; > default: > BUG("unknown bulk-checkin source: %d", source->type); > } > @@ -168,6 +177,13 @@ static ssize_t bulk_checkin_source_read(struct bulk_checkin_source *source, > switch (source->type) { > case SOURCE_FILE: > return read_in_full(source->fd, buf, nr); > + case SOURCE_INCORE: > + assert(source->read <= source->size); Is there any guideline around when to use `assert()` vs `BUG()`? I think that this assertion here is quite critical, because when it does not hold we can end up performing out-of-bounds reads and writes. But as asserts are typically missing in non-debug builds, this safeguard would not do anything for our end users, right? Patrick > + if (nr > source->size - source->read) > + nr = source->size - source->read; > + memcpy(buf, (unsigned char *)source->buf + source->read, nr); > + source->read += nr; > + return nr; > default: > BUG("unknown bulk-checkin source: %d", source->type); > } > -- > 2.42.0.405.g86fe3250c2 >
Attachment:
signature.asc
Description: PGP signature