On Thu, Jan 24, 2019 at 01:12:15PM -0800, Junio C Hamano wrote:

> Joey Hess <id@xxxxxxxxxx> writes:
>
> > When a worktree file is larger than the available memory, and a clean
> > filter is in use, this avoids mallocing a buffer the whole size of the
> > file when reading from the clean filter, which caused commands like git
> > status and git commit to OOM.
> >
> > Often in this situation the clean filter will produce a short identifier
> > for the file, so such a large buffer is not needed.
> >
> > When the clean filter does output something around the same size as the
> > worktree file, the buffer will need to be reallocated until it fits,
> > starting at 8192 and doubling in size. Benchmarking indicates that
> > reallocation is not a significant overhead for outputs up to a
> > few MB in size.
>
> Problem description first, then solution. "... this avoids ..." is
> already talking about solution while forcing the readers to know
> what the problem is.
>
>     When a worktree file is ... filter is in use, we allocate a
>     buffer for the whole size of the file when reading from the
>     clean filter. This can force us to overallocate if the clean
>     filter is used to radically shrink a huge file and replace it
>     with a small token (e.g. git-annex or git-lfs) and lead to OOM
>     at the worst case. Reading from the filter and growing the
>     buffer as we go would avoid such an unnecessary OOM.
>
>     When the clean filter does output ...
>     ... few MB in size.
>
> perhaps.

Yeah, I agree that organization is nicer.

Other than that, the patch looks good to me.

-Peff
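
For readers following along, the "start at 8192 and double" strategy described in the quoted commit message can be sketched roughly as below. This is only an illustration, not the code from the patch: read_all_growing() and INITIAL_HINT are hypothetical names, and plain malloc/realloc stand in for whatever buffer API the patch actually uses.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed starting size, taken from the "starting at 8192" wording. */
#define INITIAL_HINT 8192

/*
 * Hypothetical helper: read everything from fd into a heap buffer
 * that starts small and doubles whenever it fills up, instead of
 * being sized up front to match the worktree file.
 * Returns the number of bytes read, or -1 on error.
 * On success the caller owns *out and must free() it.
 */
static ssize_t read_all_growing(int fd, char **out)
{
	size_t alloc = INITIAL_HINT, len = 0;
	char *buf = malloc(alloc);

	if (!buf)
		return -1;

	for (;;) {
		ssize_t n;

		if (len == alloc) {
			/* Buffer is full: double it before reading more. */
			char *tmp = realloc(buf, alloc * 2);
			if (!tmp) {
				free(buf);
				return -1;
			}
			buf = tmp;
			alloc *= 2;
		}

		n = read(fd, buf + len, alloc - len);
		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry */
			free(buf);
			return -1;
		}
		if (n == 0)
			break; /* EOF: the filter closed its output */
		len += (size_t)n;
	}

	*out = buf;
	return (ssize_t)len;
}
```

With a filter that emits only a short token, the loop above never grows past the initial 8192 bytes, whatever the size of the worktree file. An output of a few MB needs on the order of ten doublings, which fits the observation that reallocation is not a significant overhead at that size.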