On Wed, Aug 30, 2017 at 2:50 PM, Ben Peart <peartben@xxxxxxxxx> wrote:
>
>
> On 8/29/2017 11:43 AM, Christian Couder wrote:
>>
>> On Mon, Aug 28, 2017 at 8:59 PM, Ben Peart <peartben@xxxxxxxxx> wrote:
>>>
>>>
>>> On 8/3/2017 5:19 AM, Christian Couder wrote:
>>>>
>>>>
>>>> +Helpers
>>>> +=======
>>>> +
>>>> +ODB helpers are commands that have to be registered using either the
>>>> +"odb.<odbname>.subprocessCommand" or the "odb.<odbname>.scriptCommand"
>>>> +config variables.
>>>> +
>>>> +Registering such a command tells Git that an external odb called
>>>> +<odbname> exists and that the registered command should be used to
>>>> +communicate with it.
>>>
>>>
>>> What order are the odb handlers called? Are they called before or after
>>> the regular object store code for loose, pack and alternates? Is the
>>> order configurable?
>>
>>
>> For get_*_object instructions the regular code is called before the odb
>> helpers.
>> (So the odb helper code is at the end of stat_sha1_file() and of
>> open_sha1_file() in sha1_file.c.)
>>
>> For put_*_object instructions the regular code is called after the odb
>> helpers.
>> (So the odb helper code is at the beginning of write_sha1_file() in
>> sha1_file.c.)
>>
>> And no this order is not configurable, but of course it could be made
>> configurable.
>>
>>>> + - 'get_direct <sha1>'
>>>> +
>>>> +This instruction is similar as the other 'get_*' instructions except
>>>> +that no object should be sent from the helper to Git. Instead the
>>>> +helper should directly write the requested object into a loose object
>>>> +file in the ".git/objects" directory.
>>>> +
>>>> +After the helper has sent the "status=success" packet and the
>>>> +following flush packet in process mode, or after it has exited in the
>>>> +script mode, Git should lookup again for a loose object file with the
>>>> +requested sha1.
>>>
>>>
>>> When will git call get_direct vs one of the other get_* functions?
>>
>>
>> It is called just before exiting when git cannot find an object.
>> It is not exactly at the same place as other get_* instructions as I
>> tried to reuse your code and as it looks like it makes it easier to
>> retry the regular code after the odb helper code.
>>
>>> Could the
>>> functionality of enabling a helper to populate objects into the regular
>>> object store be provided by having a ODB helper that returned the object
>>> data as requested by get_git_obj or get_raw_obj but also stored it in the
>>> regular object store as a loose object (or pack file) for future calls?
>>
>>
>> I am not sure I understand what you mean.
>> If a helper returns the object data as requested by get_git_obj or
>> get_raw_obj, then currently Git will itself store the object locally
>> in its regular object store, so it is redundant for the helper to also
>> store or try to store the object in the regular object store.
>>
>
> Doesn't this mean that objects will "leak out" into the regular object store
> as they are used? For example, at checkout, all objects in the requested
> commit would be retrieved from the various object stores and if they came
> from a "large blob" ODB handler, they would get retrieved from the ODB
> handler and then written to the regular object store (presumably as a loose
> object). From then on, the object would be retrieved from the regular
> object store.
>
> This would seem to defeat the goal of enabling specialized object handlers
> to handle large or other "unusual" objects that git normally doesn't deal
> well with.

Yeah, I agree that objects fetched from a helper should not be stored in
the regular object store in every case. There should be a way to control
that.
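
To make the ordering and the "leak" concrete, here is a rough toy model
in Python. It is only a sketch, not the code from the patch series: the
names Store, Helper, ObjectDb, min_size and keep_local are all made up
for illustration, and keep_local stands in for whatever control
mechanism we might end up with. It also assumes that a helper accepting
an object on put means the regular write is skipped.

```python
# Toy model of the lookup order discussed in this thread.
# Every name below is hypothetical; none of this is Git's actual API.


class Store:
    """A trivial in-memory object store keyed by sha1."""

    def __init__(self, name):
        self.name = name
        self.objects = {}

    def get(self, sha1):
        return self.objects.get(sha1)

    def put(self, sha1, data):
        self.objects[sha1] = data


class Helper(Store):
    """An external odb helper that only takes objects of min_size bytes or more."""

    def __init__(self, name, min_size=0):
        super().__init__(name)
        self.min_size = min_size

    def accepts(self, data):
        return len(data) >= self.min_size


class ObjectDb:
    def __init__(self, regular, helpers, keep_local=True):
        self.regular = regular        # stands for loose objects, packs, alternates
        self.helpers = helpers        # external odb helpers, in order
        self.keep_local = keep_local  # hypothetical knob, not an existing option

    def get(self, sha1):
        # get_*: the regular code runs first, the odb helpers last.
        data = self.regular.get(sha1)
        if data is not None:
            return data
        for helper in self.helpers:
            data = helper.get(sha1)
            if data is not None:
                if self.keep_local:
                    # The behavior Ben describes: the object "leaks" into
                    # the regular store and is served from there afterwards.
                    self.regular.put(sha1, data)
                return data
        return None

    def put(self, sha1, data):
        # put_*: the odb helpers run first, the regular code last.
        # Assumption: a helper that accepts the object ends the write.
        for helper in self.helpers:
            if helper.accepts(data):
                helper.put(sha1, data)
                return helper.name
        self.regular.put(sha1, data)
        return self.regular.name
```

With keep_local=True a second get() for the same sha1 never reaches the
helper, which is exactly the problem for a "large blob" helper. With
keep_local=False the object stays only in the helper.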