On 2/8/2021 5:26 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
>> The users of pair_chunk() presumably are not ready to (or simply do
>> not want to) process the data immediately by using read_chunk() with
>> callback, but when they get ready to process the data, unlike
>> read_chunk callbacks, they do not get to learn how much they ought
>> to process---all they learn is the address of the beginning of the
>> chunk. I do not see a way to write pair_chunk() users safely to
>> guarantee that they do not overrun at the tail end of the chunk they
>> are processing.
>
> I've read through v3 and found it mostly done, but the above
> question still stands. I find it questionable why callers of
> pair_chunk() only can learn where a chunk data begins, without
> being able to learn how big the region of memory is. IOW, why
> can we get away without doing something like this? The users
> of pair_chunk() won't even know when they overrun the end of
> the data they are given without something like this, no?

I guess that the point is that if a caller wants to perform logic
on the size, then they should use read_chunk() instead. We have
some chunks that check the size is correct upon read, but most
chunks do not do this (currently).

In a future series, additional protections could be added, and I
would expect that to be done by converting callers of pair_chunk()
into callers of read_chunk() with appropriate callback functions.

Thanks,
-Stolee
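
For illustration only, here is a rough sketch of the kind of conversion
described above. It is not taken from the series: toy_chunkfile,
toy_read_chunk(), toy_pair_chunk(), read_table_checked() and the 16-byte
TOY_TABLE_SIZE are hypothetical stand-ins. The only thing it assumes from
the discussion is that a read callback receives the chunk's size along
with its start, while a paired pointer carries no size at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback shape: start of chunk, its size, caller data. */
typedef int (*chunk_read_fn)(const unsigned char *chunk_start,
			     size_t chunk_size, void *data);

/* Toy in-memory table of contents (an assumption for this sketch). */
struct toy_chunk {
	uint32_t id;
	const unsigned char *start;
	size_t size;
};

struct toy_chunkfile {
	const struct toy_chunk *chunks;
	size_t nr;
};

/* Callback-style read: the callback sees both the start and the size. */
static int toy_read_chunk(const struct toy_chunkfile *cf, uint32_t id,
			  chunk_read_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < cf->nr; i++)
		if (cf->chunks[i].id == id)
			return fn(cf->chunks[i].start,
				  cf->chunks[i].size, data);
	return 1; /* chunk not found */
}

/* Pairing callback: records the start and ignores the size. */
static int pair_fn(const unsigned char *chunk_start, size_t chunk_size,
		   void *data)
{
	const unsigned char **p = data;

	(void)chunk_size; /* the caller never learns this */
	*p = chunk_start;
	return 0;
}

/* Pointer-only pairing, expressed on top of the callback-style read. */
static int toy_pair_chunk(const struct toy_chunkfile *cf, uint32_t id,
			  const unsigned char **p)
{
	return toy_read_chunk(cf, id, pair_fn, p);
}

/* Hypothetical fixed-size chunk used to show a size check. */
#define TOY_TABLE_SIZE 16

struct toy_table {
	const unsigned char *table;
};

/* Converted caller: reject a wrongly sized chunk before keeping it. */
static int read_table_checked(const unsigned char *chunk_start,
			      size_t chunk_size, void *data)
{
	struct toy_table *t = data;

	if (chunk_size != TOY_TABLE_SIZE)
		return -1; /* would overrun or underrun later */
	t->table = chunk_start;
	return 0;
}
```

With this shape, a caller that used toy_pair_chunk(cf, id, &t.table)
would switch to toy_read_chunk(cf, id, read_table_checked, &t) and treat
a nonzero return as a corrupt file. The pairing version accepts a
truncated chunk silently, while the callback version refuses it.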