Jeff King <peff@xxxxxxxx> writes:

> On Mon, Aug 25, 2014 at 11:35:45AM -0700, Junio C Hamano wrote:
>
>> Steffen Prohaska <prohaska@xxxxxx> writes:
>>
>> >> Couldn't we do that with an lseek (or even an mmap with offset 0)? That
>> >> obviously would not work for non-file inputs, but I think we address
>> >> that already in index_fd: we push non-seekable things off to index_pipe,
>> >> where we spool them to memory.
>> >
>> > It could be handled that way, but we would be back to the original problem
>> > that 32-bit git fails for large files.
>>
>> Correct, and you are making an incremental improvement so that such
>> a large blob can be handled _when_ the filters can successfully
>> munge it back and forth. If we fail due to out of memory when the
>> filters cannot, that would be the same as without your improvement,
>> so you are still making progress.
>
> I do not think my proposal makes anything worse than Steffen's patch.

I think we are saying the same thing, but perhaps I didn't phrase it
well.

> I think the main argument against going further is just that it is not
> worth the complexity. Tell people doing reduction filters they need to
> use "required", and that accomplishes the same thing.
>
>> >> So it seems like the ideal strategy would be:
>> >>
>> >>  1. If it's seekable, try streaming. If not, fall back to lseek/mmap.
>> >>
>> >>  2. If it's not seekable and the filter is required, try streaming. We
>> >>     die anyway if we fail.
>>
>> Puzzled... Is it assumed that any content the filters tell us to
>> use the contents from the db as-is by exiting with non-zero status
>> will always be large not to fit in-core? For small contents, isn't
>> this "ideal" strategy a regression?
>
> I am not sure what you mean by regression here. We will try to stream
> more often, but I do not see that as a bad thing.

I thought the proposed flow I was commenting on was

 - try streaming and die if the filter fails

For an optional filter working on contents that would fit in core,
we currently do

 - slurp in memory, filter it, use the original if the filter fails

If we switched to 2., then... ahh, ok, I misread the "is required" part.
The "regression" does not apply to that case at all.
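
To make the difference between the two flows concrete, here is a rough
sketch in Python. It is only a model of the behaviour described above, not
Git's actual code: the names `apply_required`, `apply_optional` and
`FilterError` are made up for illustration, a filter is assumed to be a
callable that returns None on failure, and real streaming is reduced to a
single call.

```python
from typing import Callable, Optional

# Hypothetical filter type: returns the converted bytes, or None on failure.
Filter = Callable[[bytes], Optional[bytes]]


class FilterError(Exception):
    """Raised when a required filter fails (stands in for die())."""


def apply_required(data: bytes, run_filter: Filter) -> bytes:
    # Required filter: there is no fallback, so a failure is fatal.
    # This is why the original never needs to be kept for a retry.
    out = run_filter(data)
    if out is None:
        raise FilterError("required filter failed")
    return out


def apply_optional(data: bytes, run_filter: Filter) -> bytes:
    # Optional filter: the original is held in memory so it can be
    # used as-is when the filter fails.
    out = run_filter(data)
    return data if out is None else out
```

With a filter that always fails, `apply_optional(b"abc", lambda b: None)`
hands back the original bytes, while `apply_required` raises. The
"regression" I was worried about would only arise if the first flow were
applied to optional filters, which is not what was proposed.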