Re: [PATCH 5/7] tmp-objdir: new API for creating and removing primary object dirs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 28 2021, Elijah Newren wrote:

> On Tue, Sep 28, 2021 at 1:00 AM Ævar Arnfjörð Bjarmason
> <avarab@xxxxxxxxx> wrote:
>>
>> On Tue, Aug 31 2021, Elijah Newren via GitGitGadget wrote:
>>
>> I commented just now on how this API is duplicated between here &
>> another in-flight series in
>> https://lore.kernel.org/git/87v92lxhh4.fsf@xxxxxxxxxxxxxxxxxxx/; Just
>> comments on this patch here:
>>
>> > diff --git a/tmp-objdir.c b/tmp-objdir.c
>> > index b8d880e3626..9f75a75d1c0 100644
>> > --- a/tmp-objdir.c
>> > +++ b/tmp-objdir.c
>> > @@ -288,7 +288,36 @@ const char **tmp_objdir_env(const struct tmp_objdir *t)
>> >       return t->env.v;
>> >  }
>> >
>> > +const char *tmp_objdir_path(struct tmp_objdir *t)
>> > +{
>> > +     return t->path.buf;
>> > +}
>>
>> Not your fault & this pre-dates your patch, but FWIW I prefer our APIs
>> that don't have these "hidden struct" shenanigans (like say "struct
>> string_list") so a caller could just access this, we can just declare it
>> "const" appropriately.
>>
>> We're also all in-tree friends here, so having various accessors for no
>> reason other than to access struct members seems a bit too much.
>
> That's a fair point, but just to play counterpoint for a minute...
>
> FWIW, I dislike when our public facing APIs are polluted with all
> kinds of internal details.  merge-recursive being a case in point.
> When writing merge-ort, although I had a struct with public fields,
> that struct also contained an opaque struct (pointer) within it to
> hide several private fields.  (I would have liked to hide or remove a
> few more fields, but couldn't do so while the merge_recursive_options
> struct was shared between both merge-ort and merge-recursive.)
>
> That said, I agree it can certainly be overdone, and tmp_objdir is
> pretty simple.  However, sometimes even in simple cases I like when
> folks make use of an opaque struct because it means folks put some
> effort into thinking more about the API that should be exposed.
> That's something we as a project have often overlooked in the past, as
> evidenced both by our one-shot usage mentality, and the existence of
> external projects like libgit2 attempting to correct this design
> shortcoming.  I'd like git to move more towards being structured as a
> reusable library as well as a useful command-line tool.

[This is somewhat of a continuation of
https://lore.kernel.org/git/87lf3etaih.fsf@xxxxxxxxxxxxxxxxxxx/].

I like our APIs being usable, but right now we're not promising any
maintenance of APIs that work the same as git v2.30.0 or whatever.

I think some external users directly link to libgit.a, but they're
tracking whatever churn we have from release to release, just like
someone maintaining out-of-tree linux kernel drivers.

For maintenance of git.git itself I think it's going to stay like that
for any foreseeable future[1].

And that's before we even get into the practicalities of git's license
likely not winning over many (if any) libgit2 users. I think those users
are if anything migrating from libgit2 to git's plumbing CLI, or in some
cases folding what they needed into git.git itself (or just staying with
libgit2).

So assuming all of that I really don't see the point in general of
having a "foo" field that has the equivalent of get_foo() and set_foo()
accessors in the context of git.git.

That's something that's really useful if you're maintaining API and ABI
compatibility, but for us it's usually just verbosity. All the users are
in-tree, so just add "const" as appropriate, or perhaps we should give
the field a "priv" name or whatever.

Obviously not everything should be done that way, e.g. being able to
have trace2 start/end points for something you'd otherwise push/clear
from a "struct string_list" or whatever directly *is* usable, so it's
not black and white.

I'm just saying thaht using convoluted patterns *just* to get in the way
of some hypothetical code that's going to mess with your internals can
make things harder in other areas. E.g. the init/release pattern via
macros I mentioned in the linked E-Mail above.

> [...]
>> In this particular case I hadn't understood on a previous reading of
>> tmp-objdir.c why this "path" isn't a statically allocated "char
>> path[PATH_MAX]", or a least we that hardcoded "1024" should be a
>> PATH_MAX, as it is in some other cases.
>
> Actually, PATH_MAX shouldn't be used:
>
> https://lore.kernel.org/git/7d1237c7-5a83-d766-7d93-5f0d59166067@xxxxxx/
> https://lore.kernel.org/git/20180720180427.GE22486@xxxxxxxxxxxxxxxxxxxxx/

Thanks. I didn't know about that. I've fixed (some of that) up in some
semi-related WIP work I've got.

1. Aside: I *do* think it would be really useful for us to expose a
   stable C API eventually, using some of the same license
   workarounds/shenanigans (depending on your view) as the Linux kernel
   did. I.e. to say that "we're GPLv2, but this small API is the
   equivalent of our syscall interface".

   Rather than that API being something that would look anything like
   internal git.git users it would basically be a run_command() + Perl's
   unpack/pack + output formatter on steroids.

   I.e. we'd be able to invoke say a 'git cat-file --batch' in the same
   process as the caller by essentially calling cmd_cat_file()
   directly. Just with some callback replacements for "provide this on
   stdin" and "this line accepts stdout" without actually involving
   different processes or I/O.

   You'd then have the equivalent of a --format output that would just
   so happen to be in your OS's / C ABI's understood struct
   format. Similar to "perldoc -f pack".

   The maintenance burden of such an API would be trivial. It would
   basically be a handful of functions just to setup input/output
   callbacks.

   Most of the interface would be a rather convoluted construction of
   command-line parameters (like our own run_command() use), not ideal,
   but most users are after structured data and eliminating process
   overhead, and they'd get that.

   We would need to replace fprintf() and the like with some structured
   formatting mechanism, but that would be useful for non-API users as
   new commands learning a shiny new --format parameter. You could even
   use such a --format to have a non-C-API user get the same struct
   output, perhaps for direct mapping into memory, say if you didn't
   trust the built-in/command in question to not leak memory yet.

   I don't know if this has been suggested before, just my 0.02. It's
   also partially why I'm interested in making even-built-ins not leak
   memory with some of my parallel SANITIZE=leak work.

   Once you've got that you can usually call the cmd_whatever() in a
   loop without much effort, and once you've got that, well, see above
   :)

   




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux