I think this is a great approach, and one that I'd be happy to implement
in LFS. The additional capability isn't too complex, so I think other
filters similar to LFS shouldn't have a hard time implementing it
either. I left a few comments, mostly expressing approval of the
documentation changes. I'll leave the C review to someone more expert
than me. +1 from me on the protocol changes.

> +Delay
> +^^^^^
> +
> +If the filter supports the "delay" capability, then Git can send the
> +flag "delay-able" after the filter command and pathname.

Nit: I think either way is fine, but `can_delay` will save us 1 byte per
new checkout entry.

> +"delay-id", a number that identifies the blob, and a flush packet. The
> +filter acknowledges this number with a "success" status and a flush
> +packet.

I mentioned this in another thread, but I'd prefer, if possible, that we
use the pathname as a unique identifier for referring back to a
particular checkout entry. I think adding an additional identifier adds
unnecessary complication to the protocol and introduces a forced mapping
on the filter side from id to path.

Both Git and the filter are going to have to keep these paths in memory
somewhere, be that in-process or on disk. That being said, I can see
potential trouble with a large number of long paths that exceed the
memory available to Git or the filter when stored in a hashmap/set.

On Git's side, I think trading that memory for some CPU time might make
sense. If Git were to SHA1 each path and store the result in a hashmap,
it would consume more CPU time, but less memory per path. Git and the
filter could then exchange pathnames, and Git would simply SHA1 the
pathname each time it needed to refer back to the memory associated with
that entry in the hashmap.

> +by a "success" status that is also terminated with a flush packet. If
> +no blobs for the delayed paths are available, yet, then the filter is
> +expected to block the response until at least one blob becomes
> +available. The filter can tell Git that it has no more delayed blobs
> +by sending an empty list.

I think the blocking nature of the `list_available_blobs` command is a
great synchronization mechanism for telling the filter when it can and
can't send new blob information to Git.

I was tentatively thinking of suggesting to remove this command and
instead allow the filter to send readied blobs in sequence after all
unique checkout entries had been sent from Git to the filter. But I
think this approach allows us more flexibility, and isn't that much
extra complication, or that many extra bytes across the pipe.

--
Thanks,
Taylor Blau
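P.S. Purely to illustrate the path-hashing trade-off I described above,
here is a rough sketch (in Python, with hypothetical names; Git itself
would of course do this in C against its own hashmap API) of keying
per-entry state by the SHA-1 of the pathname rather than by the path
itself:

```python
import hashlib

class DelayedEntries:
    """Sketch: delayed-checkout state keyed by SHA-1(path).

    Each key is a fixed 20 bytes regardless of path length, so memory
    per entry is bounded; the cost is one hash computation per lookup.
    """

    def __init__(self):
        self._entries = {}  # 20-byte SHA-1 digest of path -> state

    @staticmethod
    def _key(path):
        # Re-hash the pathname every time we need to refer back to the
        # memory associated with that entry.
        return hashlib.sha1(path.encode("utf-8")).digest()

    def add(self, path, state):
        self._entries[self._key(path)] = state

    def get(self, path):
        return self._entries[self._key(path)]
```

Only the fixed-size digests live in the map, so a very deep tree full of
long pathnames costs the same per-entry memory as a shallow one.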