Ramsay Jones <ramsay@xxxxxxxxxxxxxxxxxxxx> writes:

> It should be, see commit b97e911643 ("Support for large files
> on 32bit systems.", 17-02-2007), where you can see that the
> _FILE_OFFSET_BITS macro is set to 64. This asks <stdio.h> et al.
> to use the "Large File System" API and a 64-bit off_t.

Correct.

Roughly speaking, we should view off_t as the size of things you can
store in a file, and size_t as the size of things you can have in
core.

When we allocate a region of memory, and then repeatedly fill it with
some data and hand that memory to a helper function, e.g.
git_deflate(), each of these calls should expect to get data whose
size is representable by size_t, and if that is shorter than the
ulong we currently use, we are artificially limiting our potential by
using a type that is narrower than necessary.  The results from these
repeated helper calls may be sent to a file as the output from the
loop.  If the logic in the outer loop wants to keep a tally of the
total size of the data it processed, that number may not fit in
size_t and may instead require off_t.

One interesting question is which of these two types we should use
for the size of objects Git uses.  Most of the "interesting"
operations done by Git require the thing to be in core as a whole
before we can do anything with it (e.g. comparing two such things to
produce a delta, or having one in core and applying a patch), so it
is tempting to deal with size_t, but at the lowest level, to serve as
an SCM, i.e. recording the state of a file at each version, we
actually should be able to exceed the in-core limit---both "git add"
of a huge file whose contents would not fit in core and "git
checkout" of a huge blob whose inflated contents would not fit in
core should (in theory, modulo bugs) be able to exercise the
streaming interface to handle such a case without holding everything
in core at once.  So from that point of view, even size_t may not be
the "correct" type to use.
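
To make the type distinction concrete, here is a minimal sketch of
such a loop (not actual Git code; read_next_chunk(), deflate_chunk()
and write_to_file() are made-up helpers for illustration): each
in-core chunk is sized with size_t, while the running tally of what
has been written out to the file is an off_t.

	#include <stddef.h>     /* size_t */
	#include <sys/types.h>  /* off_t */

	/* Hypothetical helpers standing in for the real code paths. */
	extern size_t read_next_chunk(void *buf, size_t bufsize);
	extern size_t deflate_chunk(const void *in, size_t insize,
				    void *out, size_t outsize);
	extern void write_to_file(int fd, const void *buf, size_t len);

	static off_t compress_stream(int fd)
	{
		static char in[1 << 20], out[1 << 20];
		size_t got;      /* size of one in-core chunk */
		off_t total = 0; /* total bytes written to the file */

		while ((got = read_next_chunk(in, sizeof(in))) > 0) {
			size_t n = deflate_chunk(in, got, out, sizeof(out));

			write_to_file(fd, out, n);
			total += n; /* may exceed what size_t can hold */
		}
		return total;
	}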