> On 02 Jun 2017, at 04:21, Junio C Hamano <gitster@xxxxxxxxx> wrote: > > Lars Schneider <larsxschneider@xxxxxxxxx> writes: > >> ... >> +After Git received the pathnames, it will request the corresponding >> +blobs again. These requests contain a pathname and an empty content >> +section. The filter is expected to respond with the smudged content >> +in the usual way as explained above. >> ... > > Random things that come to mind (which I suspect has been dealt with > in code but not described in the above in detail): > > * Using pathname as a "delay key" would mean that we assume that a > single helper process is spawned to serve only a single Git > invocation (as opposed to be sitting as a daemon accepting > requests from instances of Git invoked much later than the daemon > started), which is OK, but it is somewhat unclear when a filter > is allowed to discard the requests that used to be outstanding > from the write-up. Imagine Git asked for A and B, both of which > was delayed, and then asked for a list of available ones and > learned that A is now ready. > > - Is it an error to ask for B at this point, and if so how is the > error handled? Or will it turn into a blocking operation? It is no error per se and it should just work. If CE_DELAYED is set then the filter could even delay the blob. However, I would consider this bug in Git that would need to be fixed. Would this make the documentation clear enough? After Git received the pathnames, it will request the corresponding blobs again (and only these blobs). > - As A is available, Git can ask for it. After retrieving > smudged content for A, can Git ask for it again? Or is it an > error to do so, without starting over with another can-delay=1 > request for the same path? I'd consider it a bug if Git asks for a blob again. How about this? The filter is expected to respond with the smudged content in the usual way as explained above. The filter can free any resources associated with the blob after Git acknowledged the delivery. Git will not ask for the blob again. > * The pathname does not have to be encoded/quoted in any way as it > is just a payload on the tail end of a packet line, so I am > guessing it is just a sequence raw bytes. It is somewhat unclear > in the above write-up. Also, is this considered to be a part of > "textual" exchange over the packet-line protocol, where it is > customary to remove the trailing LF from the packet? "We do not > support a file whose name ends with LF" may or may not be an > acceptable stance (I am personally OK, but some people who use > Git may not), but it needs to be documented if there is such an > issue. How about this? If the filter supports the "delay" capability then it must support the "list_available_blobs" command. If Git sends this command, then the filter is expected to return a list of pathnames of blobs that are available. The pathname packets send by the filter must exactly match the pathname packets that Git requested initially. Maybe even "must match bit-exact"? > * The continued request throws an empty content in the above > illustration. Is a filter allowed/encouraged to assume that an > empty content in the request is a continuation? I am guessing > that the answer is NO (otherwise you cannot filter an empty > file), and it somehow need to be documented, perhaps? Correct, the answer is NO. Would "irrelevant content section" be OK or should I try to find a more explicit way to document that? After Git received the pathnames, it will request the corresponding blobs again. These requests contain a pathname and an irrelevant (usually empty) content section. The filter is expected to respond with the smudged content in the usual way as explained above. Thanks a lot for the review, Lars