Re: [PATCH v2 12/17] chunk-format: create read chunk API

Derrick Stolee <stolee@xxxxxxxxx> · Mon, 8 Feb 2021 20:33:29 -0500

On 2/8/2021 5:26 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@xxxxxxxxx> writes:
> 
>> The users of pair_chunk() presumably are not ready to (or simply do
>> not want to) process the data immediately by using read_chunk() with
>> callback, but when they get ready to process the data, unlike
>> read_chunk callbacks, they do not get to learn how much they ought
>> to process---all they learn is the address of the beginning of the
>> chunk.  I do not see a way to write pair_chunk() users safely to
>> guarantee that they do not overrun at the tail end of the chunk they
>> are processing.
> 
> I've read through v3 and found it mostly done, but the above
> question still stands.  I find it questionable why callers of
> pair_chunk() only can learn where a chunk data begins, without
> being able to learn how big the region of memory is.  IOW, why
> can we get away without doing something like this?  The users
> of pair_chunk() won't even know when they overrun the end of
> the data the are given without something like this, no?

I guess that the point is that if a caller wants to perform
logic on the size, then they should use read_chunk() instead.
We have some chunks that check the size is correct upon read,
but most chunks do not do this (currently).

In future series, additional protections could be added, and
I would expect that to be done by converting callers of
pair_chunk() into callers of read_chunk() with appropriate
callback functions.

Thanks,
-Stolee