> On Tue, Jul 28, 2020 at 12:23:50PM -0700, Jonathan Tan wrote:
>
> > > For this particular case, with the performance and all, I agree that
> > > the stupid and robust approach would be the best.
> >
> > I'm concerned that we will be painting ourselves into a corner here - I
> > have been appreciating the richer interface that a C call provides us,
> > compared to sub-processes where we have to communicate through a single
> > input stream and a single output stream. For example, "wanted-refs" and
> > [...]
>
> Yeah, that's a compelling reason. I'd have thought for this use case you
> could just say "hey, make sure these objects exist", which doesn't
> require a lot of communication. But often when I think things like that
> and start coding, it turns out to be much more complicated. So I am
> perfectly willing to believe you that it doesn't apply here. And even
> if it did, you're right that we may run into other spots that do need to
> pass back more information, but benefit from more lib-ified code that
> doesn't die().

Thanks. Just to be clear, I can't think of any hard blockers to
implementing the lazy fetch in a sub-process right now, but (as you said)
while programming I could discover something, or we could need something
in the future.

> Just to play devil's advocate for a moment...
>
> > (Also, I think that debugging within a process is easier than debugging
> > across processes, but that might not be a concern that other people
> > share.)
>
> This is definitely true sometimes, but I think it is sometimes the
> opposite. When we push things out to a sub-process, then the interface
> between the two processes has to be well-defined (e.g., writing results
> to a file with a particular format). And that can make debugging easier,
> because you can pick up from that intermediate state (munging it in the
> middle, or even generating it from scratch for testing).
Well, unless there is some sort of interactivity, like in remote
helpers :-)

> Likewise, that can result in a more flexible and robust system from the
> perspective of users. If we had invented "git log" first, we probably
> wouldn't have "git rev-list | git diff-tree --stdin" at all. But having
> that as two separate tools is sometimes useful for people doing things
> _besides_ log, since it gives different entry points to the code.

That's true. And the lazy fetch might be one of them - after looking at
the code, I think that I can get to where I want just by implementing a
null negotiator, which would also be useful for end users. (In
particular, simulating a lazy fetch might be useful sometimes, and
re-fetching a packfile could be a crude way of repairing object
corruption.)

> That said, I think I could buy the argument that "fetch" works pretty
> well as a basic building block for users. It's pretty rare to actually
> use fetch-pack as a distinct operation. This is all a monolith vs module
> tradeoff question, and the tradeoff around modularity for fetch isn't
> that compelling.

If we are going the sub-process route, I was planning to use "fetch" as
the sub-process, actually, not "fetch-pack" - among other things, "fetch"
allows us to specify a fetch negotiator. So it seems like you are saying
that if we had to use "fetch-pack", you would have no problem with
libifying it and calling it in the same process, but if we can use
"fetch", we should use it as a sub-process?

In any case, I'll look into using "fetch" as a sub-process and see if it
works.