Hi Junio, On Fri, 29 Oct 2021, Junio C Hamano wrote: > "Matt Cooper via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > > > -void *read_blob_entry(const struct cache_entry *ce, unsigned long *size) > > +void *read_blob_entry(const struct cache_entry *ce, size_t *size) > > { > > enum object_type type; > > - void *blob_data = read_object_file(&ce->oid, &type, size); > > + unsigned long ul; > > + void *blob_data = read_object_file(&ce->oid, &type, &ul); > > > > + *size = ul; > > It is a bit curious place to draw the line; we want to make sure > that blob_entry can hold huge data, but in this step we do not mind > read_object_file() is not capable of going full 64-bit? Indeed that is the case. The consideration here is: how large of a patch series do I want to take this late in the cycle? Here is the call tree (for full details, look no further than https://github.com/git-for-windows/git/pull/3487#issuecomment-950727616): read_object_file() repo_read_object_file() read_object_file_extended() read_object() oid_object_info_extended() do_oid_object_info_extended() loose_object_info() parse_loose_header_extended() packed_object_info() cache_or_unpack_entry() unpack_entry() unpack_object_header() unpack_object_header_buffer() get_size_from_delta() get_delta_hdr_size() All three leaves have code that needs to be adjusted to use `size_t` instead of `unsigned long`, and all of the other functions in that call tree need to be adjusted for that. Some of the callers do not even pass an `unsigned long` pointer around, but instead a pointer to `struct object_info` (which, you guessed it, also has an `unsigned long` that should have been a `size_t` from the beginning). This is too big a change I am willing to work on, let alone accept, this late in the cycle. Sure, it would fix the scenario where clean/smudge filters are not even involved (read: `git add`ing a 5GB file _directly_). But the potential for bugs to hide! > I guess I'll see soon enough why by reading later steps. I can see > that for the purpose of making write_entry() aware of the size_t, > this is necessary at the minimum. I fear that you won't see more about this in the later steps ;-) Ciao, Dscho