Shawn Pearce <spearce@xxxxxxxxxxx> writes:

> The other problem here is the caller cannot access the written objects
> until the pack is closed.  That is one of the things that has made
> fast-import difficult for git-svn to use, because git-svn expects the
> object to be available immediately.  I assume that within a single git
> add or git update-index process we don't need to worry about this, so
> its probably a non-issue.

Yes, it is part of a possible issue to be addressed in the plan.

I envisioned that the "API" I talked about in the NEEDSWORK you quoted
would keep an open file descriptor to the "currently being built"
packfile wrapped in a "struct packed_git", with an in-core index_data
that is adjusted every time you add a straight-to-pack kind of object.
Upon a "finalize" call, it would determine the final pack name, write
the real pack .idx file out, and rename the "being built" packfile to
the final name to make it available to the outside world.

Within a single git process, that approach would give access to the set
of objects that are going straight to the pack.  When it needs to spawn
a git subprocess, however, it would need to finalize the pack to give
the subprocess access to the new objects, just like fast-import flushes
when it is asked to expose its marks.

After all, this topic is about handling large binary files that would
not fit in core at once (we do not support them at all right now).  It
may not be too bad to say we stuff one object per packfile and
immediately close the packfile (which is what the use of fast-import by
the POC patch does).  Once the packfile is closed, the object in it is
automatically available to the outside world, and it is just a matter
of making a reprepare_packed_git() call to make it available to
ourselves.
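The "finalize" step above is essentially the usual write-under-a-
temporary-name-then-rename pattern.  Here is a minimal shell sketch of
just that pattern; the file names are made up for illustration, and
using a checksum of the pack contents as the final name is a stand-in
(the real pack name is derived differently), not what the code would
actually compute:

```shell
#!/bin/sh
# Sketch: build the pack under a temporary name, and only rename() it
# into place once it is complete, so the outside world never sees a
# half-written packfile.

objdir=objects/pack
mkdir -p "$objdir"

# Build phase: the writer appends objects to a file under a temp name
# that readers will never look at.
tmp="$objdir/tmp_pack_$$"
printf 'PACK...object data...' >"$tmp"

# Finalize phase: derive the final name (here, hypothetically, from a
# checksum of the contents), write the .idx, then atomically expose
# the pack by renaming it.
sum=$(cksum <"$tmp" | awk '{print $1}')
: >"$objdir/pack-$sum.idx"          # stand-in for writing the real index
mv "$tmp" "$objdir/pack-$sum.pack"  # rename() makes it visible atomically

ls "$objdir"
```

Until the rename happens, other processes scanning objects/pack/ for
"pack-*.pack" simply do not see the file, which is the property the
in-process API would rely on.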
When there are many such objects, as they would exceed
core.bigFileThreshold, repacking them would just amount to copying the
already compressed data literally (I haven't re-checked the code,
though), and the cost shouldn't be more than proportional to the size
of the data.  Expecting any system to do better than that is asking
for the moon, and I am not willing to bend over backwards to cater to
such demands before running out of other better things to do ;-).

So I am tempted to keep the "spawn an external fast-import" code at
least for now, and give a higher priority to making the other side
(writing out the blob to a working tree) streamable.