Lars Schneider <larsxschneider@xxxxxxxxx> wrote:
> Hi,
>
> Git always performs a clean/smudge filter on files in sequential order.
> Sometimes a filter operation can take a noticeable amount of time.
> This blocks the entire Git process.

I have the same problem in many places which aren't git :>

> I would like to give a filter process the possibility to answer Git with
> "I got your request, I am processing it, ask me for the result later!".
>
> I see the following way to realize this:
>
> In unpack-trees.c:check_updates() [1] we loop through the cache
> entries and "ask me later" could be an acceptable return value of the
> checkout_entry() call. The loop could run until all entries returned
> success or error.
>
> The filter machinery is triggered in various other places in Git and
> all places that want to support "ask me later" would need to be patched
> accordingly.

That all sounds reasonable.

The filter itself would need to be aware of parallelism
if it lives for multiple objects, right?

> Do you think this could be a viable approach?

It'd probably require a bit of work, but yes, I think it's viable.

We already do this with curl_multi requests for parallel
fetching from dumb HTTP servers, but that's driven by curl
internals operating with a select/poll loop.

Perhaps the curl API could be a good example for doing this.

> Do you see a better way?

Nope, I prefer non-blocking state machines to threads for
debuggability and determinism.

Anyways, I'll plan on doing something similar (in Perl) with the
synchronous parts of public-inbox which relies on "cat-file --batch"
at some point... (my rotational disks are sloooooooow :<)
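
For what it's worth, here is a rough sketch of the "loop until all
entries returned success or error" idea. It is in Python purely for
illustration; it is not Git's C code and not an existing protocol.
Status, FakeFilter and the path names are all hypothetical, and only
check_updates() and checkout_entry() borrow their names from the
functions mentioned above.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    ERROR = "error"
    LATER = "later"  # hypothetical "ask me later" reply


class FakeFilter:
    """Hypothetical stand-in for a long-running filter process."""

    def __init__(self, work):
        # work maps path -> number of polls before the result is ready;
        # a negative count means the filter fails for that path.
        self.remaining = dict(work)

    def checkout_entry(self, path):
        n = self.remaining[path]
        if n < 0:
            return Status.ERROR
        if n == 0:
            return Status.OK
        self.remaining[path] = n - 1
        return Status.LATER


def check_updates(entries, flt):
    """Ask about each entry repeatedly until all are OK or ERROR."""
    results = {}
    pending = list(entries)
    while pending:
        still_pending = []
        for path in pending:
            status = flt.checkout_entry(path)
            if status is Status.LATER:
                still_pending.append(path)  # retry on the next pass
            else:
                results[path] = status
        pending = still_pending
    return results


flt = FakeFilter({"a.txt": 0, "big.bin": 2, "bad.dat": -1})
results = check_updates(["a.txt", "big.bin", "bad.dat"], flt)
```

This version simply spins over the pending list. A real implementation
would presumably wait in select/poll (as the curl_multi code does)
until the filter has something to report, instead of asking again
right away.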