Re: [PATCH v2 12/17] chunk-format: create read chunk API

Junio C Hamano <gitster@xxxxxxxxx> · Mon, 08 Feb 2021 14:26:58 -0800

Junio C Hamano <gitster@xxxxxxxxx> writes:

> The users of pair_chunk() presumably are not ready to (or simply do
> not want to) process the data immediately by using read_chunk() with
> callback, but when they get ready to process the data, unlike
> read_chunk callbacks, they do not get to learn how much they ought
> to process---all they learn is the address of the beginning of the
> chunk.  I do not see a way to write pair_chunk() users safely to
> guarantee that they do not overrun at the tail end of the chunk they
> are processing.

I've read through v3 and found it mostly done, but the above
question still stands.  I find it questionable that callers of
pair_chunk() can learn only where a chunk's data begins, without
being able to learn how big the region of memory is.  IOW, why
can we get away without doing something like this?  The users
of pair_chunk() won't even know when they overrun the end of
the data they are given without something like this, no?

Thanks.

+struct memory_region {
+	const unsigned char *p;
+	size_t sz;
+};
+
 static int pair_chunk_fn(const unsigned char *chunk_start,
                          size_t chunk_size,
                          void *data)
 {
-        const unsigned char **p = data;
-        *p = chunk_start;
+        struct memory_region *x = data;
+        x->p = chunk_start;
+        x->sz = chunk_size;
         return 0;
 }

 int pair_chunk(struct chunkfile *cf,
                uint32_t chunk_id,
-                const unsigned char **p)
+                const unsigned char **p,
+                size_t *sz)
 {
+        int ret;
+        struct memory_region x = { NULL, 0 };
-        return read_chunk(cf, chunk_id, pair_chunk_fn, &x);
+        ret = read_chunk(cf, chunk_id, pair_chunk_fn, &x);
+        *p = x.p;
+        *sz = x.sz;
+        return ret;
 }
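
To illustrate what a caller gains from knowing the size, here is a
minimal sketch of a bounds-checked accessor.  It is not part of the
patch: region_get_be32() is a hypothetical helper name, and the
assumption that a chunk holds an array of big-endian 32-bit entries
is made only for the sake of the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct proposed in the patch above. */
struct memory_region {
	const unsigned char *p;
	size_t sz;
};

/*
 * Hypothetical helper: fetch the index-th big-endian 32-bit entry
 * from a chunk.  Returns 0 and stores the value in *out on success,
 * or -1 if the region is missing or the entry would lie (even
 * partly) past the end of the chunk.
 */
static int region_get_be32(const struct memory_region *r,
			   size_t index, uint32_t *out)
{
	const unsigned char *q;

	/* Only whole entries count; trailing partial bytes are ignored. */
	if (!r->p || index >= r->sz / sizeof(uint32_t))
		return -1;

	q = r->p + index * sizeof(uint32_t);
	*out = ((uint32_t)q[0] << 24) | ((uint32_t)q[1] << 16) |
	       ((uint32_t)q[2] << 8) | (uint32_t)q[3];
	return 0;
}
```

With only the start address, as pair_chunk() hands out today, a
check like the one on index above cannot be written at all; the
caller has to trust that the file is well formed.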