Re: [PATCH 00/26] Moving global state into the repository object (part 1)

Duy Nguyen <pclouds@xxxxxxxxx> · Wed, 14 Feb 2018 07:57:06 +0700

On Wed, Feb 14, 2018 at 2:26 AM, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
>> Duy suggested that we shall not use the repository blindly, but should carefully
>> examine whether to pass on an object store or the refstore or such[4], which
>> I agree with if it makes sense. This series unfortunately has an issue with that
>> as I would not want to pass down the `ignore_env` flag separately from the object
>> store, so I made all functions that only take the object store to have the raw
>> object store as the first parameter, and others using the full repository.
>
> I think I want to push back on this a little.
>
> The advantage of a function taking e.g. an object_store as an argument
> instead of a repository is that it increases its flexibility, since it
> allows callers that do not have access to a repository to call it.  The
> disadvantage is also that it increases the flexibility without any
> callers benefitting from that:
>
>  1. It ties us to assumptions from today.  If e.g. an object access in
>     the future starts relying on some other information from the
>     repository (e.g. its config) then we'd have to either add a
>     back-pointer from the object store to its repository or add
>     additional arguments for that additional data at that point.
>
>     If all callers already have a repository, it is simpler to pass
>     that repository as context so that we have the flexibility to make
>     more use of it later.

It's essentially putting all global variables in the same place again.
Only this time it's not global namespace, but "struct repository".
It's no worse than the current state though.

>
>  2. It complicates the caller.  Instead of consistently passing the
>     same repository argument as context to functions that access that
>     repository, the caller would have to pull out relevant fields like
>     the object store from it.

Well, I see that as a good point :)

>
>  3. It prevents us from making opportunistic use of other information
>     from the repository, such as its name for use in error messages.

It does not exactly prevent us. It's just more effort to pass this
around. The repository name for example, there's no reason we can't
have object store name (which could be initialized the same as repo
name).

> In lower-level funcitons that need to be usable by callers without a
> repository (e.g. to find packfiles in an alternate) it makes sense to
> not pass a repository, but without such a use case in mind

I do agree with your benefit argument. But I'd like to point out
though that having all input to object store visible from something
like "struct raw_object_store" makes it easier to reason about the
code. I know how object store works, but occasionally I'm still
surprised when it getenv (or read $GIT_DIR/index, but not in object
store code) behind the scene. Imagine how hard it is for newcomers.

I would count that as benefit, even though it's not a use case per se.
Another potential benefit is writing unit tests will be much easier
(you can configure object store through struct repository too, but
setting one piece here, one piece there to control object store
behavior is not a nice experience). It's a nice thing to have, but not
a deciding factor.

> I don't think it needs to be a general goal.

My stand is a bit more aggressive. We should try to achieve better
abstraction if possible. But if it makes Stefan's life hell, it's not
worth doing. Converting to 'struct repository' is already a step
forward. Actually if it discourages him from finishing this work, it's
already not worth doing.
-- 
Duy