On Tue, Sep 28 2021, Elijah Newren wrote: > On Tue, Sep 28, 2021 at 1:00 AM Ævar Arnfjörð Bjarmason > <avarab@xxxxxxxxx> wrote: >> >> On Tue, Aug 31 2021, Elijah Newren via GitGitGadget wrote: >> >> I commented just now on how this API is duplicated between here & >> another in-flight series in >> https://lore.kernel.org/git/87v92lxhh4.fsf@xxxxxxxxxxxxxxxxxxx/; Just >> comments on this patch here: >> >> > diff --git a/tmp-objdir.c b/tmp-objdir.c >> > index b8d880e3626..9f75a75d1c0 100644 >> > --- a/tmp-objdir.c >> > +++ b/tmp-objdir.c >> > @@ -288,7 +288,36 @@ const char **tmp_objdir_env(const struct tmp_objdir *t) >> > return t->env.v; >> > } >> > >> > +const char *tmp_objdir_path(struct tmp_objdir *t) >> > +{ >> > + return t->path.buf; >> > +} >> >> Not your fault & this pre-dates your patch, but FWIW I prefer our APIs >> that don't have these "hidden struct" shenanigans (like say "struct >> string_list") so a caller could just access this, we can just declare it >> "const" appropriately. >> >> We're also all in-tree friends here, so having various accessors for no >> reason other than to access struct members seems a bit too much. > > That's a fair point, but just to play counterpoint for a minute... > > FWIW, I dislike when our public facing APIs are polluted with all > kinds of internal details. merge-recursive being a case in point. > When writing merge-ort, although I had a struct with public fields, > that struct also contained an opaque struct (pointer) within it to > hide several private fields. (I would have liked to hide or remove a > few more fields, but couldn't do so while the merge_recursive_options > struct was shared between both merge-ort and merge-recursive.) > > That said, I agree it can certainly be overdone, and tmp_objdir is > pretty simple. However, sometimes even in simple cases I like when > folks make use of an opaque struct because it means folks put some > effort into thinking more about the API that should be exposed. > That's something we as a project have often overlooked in the past, as > evidenced both by our one-shot usage mentality, and the existence of > external projects like libgit2 attempting to correct this design > shortcoming. I'd like git to move more towards being structured as a > reusable library as well as a useful command-line tool. [This is somewhat of a continuation of https://lore.kernel.org/git/87lf3etaih.fsf@xxxxxxxxxxxxxxxxxxx/]. I like our APIs being usable, but right now we're not promising any maintenance of APIs that work the same as git v2.30.0 or whatever. I think some external users directly link to libgit.a, but they're tracking whatever churn we have from release to release, just like someone maintaining out-of-tree linux kernel drivers. For maintenance of git.git itself I think it's going to stay like that for any foreseeable future[1]. And that's before we even get into the practicalities of git's license likely not winning over many (if any) libgit2 users. I think those users are if anything migrating from libgit2 to git's plumbing CLI, or in some cases folding what they needed into git.git itself (or just staying with libgit2). So assuming all of that I really don't see the point in general of having a "foo" field that has the equivalent of get_foo() and set_foo() accessors in the context of git.git. That's something that's really useful if you're maintaining API and ABI compatibility, but for us it's usually just verbosity. All the users are in-tree, so just add "const" as appropriate, or perhaps we should give the field a "priv" name or whatever. Obviously not everything should be done that way, e.g. being able to have trace2 start/end points for something you'd otherwise push/clear from a "struct string_list" or whatever directly *is* usable, so it's not black and white. I'm just saying thaht using convoluted patterns *just* to get in the way of some hypothetical code that's going to mess with your internals can make things harder in other areas. E.g. the init/release pattern via macros I mentioned in the linked E-Mail above. > [...] >> In this particular case I hadn't understood on a previous reading of >> tmp-objdir.c why this "path" isn't a statically allocated "char >> path[PATH_MAX]", or a least we that hardcoded "1024" should be a >> PATH_MAX, as it is in some other cases. > > Actually, PATH_MAX shouldn't be used: > > https://lore.kernel.org/git/7d1237c7-5a83-d766-7d93-5f0d59166067@xxxxxx/ > https://lore.kernel.org/git/20180720180427.GE22486@xxxxxxxxxxxxxxxxxxxxx/ Thanks. I didn't know about that. I've fixed (some of that) up in some semi-related WIP work I've got. 1. Aside: I *do* think it would be really useful for us to expose a stable C API eventually, using some of the same license workarounds/shenanigans (depending on your view) as the Linux kernel did. I.e. to say that "we're GPLv2, but this small API is the equivalent of our syscall interface". Rather than that API being something that would look anything like internal git.git users it would basically be a run_command() + Perl's unpack/pack + output formatter on steroids. I.e. we'd be able to invoke say a 'git cat-file --batch' in the same process as the caller by essentially calling cmd_cat_file() directly. Just with some callback replacements for "provide this on stdin" and "this line accepts stdout" without actually involving different processes or I/O. You'd then have the equivalent of a --format output that would just so happen to be in your OS's / C ABI's understood struct format. Similar to "perldoc -f pack". The maintenance burden of such an API would be trivial. It would basically be a handful of functions just to setup input/output callbacks. Most of the interface would be a rather convoluted construction of command-line parameters (like our own run_command() use), not ideal, but most users are after structured data and eliminating process overhead, and they'd get that. We would need to replace fprintf() and the like with some structured formatting mechanism, but that would be useful for non-API users as new commands learning a shiny new --format parameter. You could even use such a --format to have a non-C-API user get the same struct output, perhaps for direct mapping into memory, say if you didn't trust the built-in/command in question to not leak memory yet. I don't know if this has been suggested before, just my 0.02. It's also partially why I'm interested in making even-built-ins not leak memory with some of my parallel SANITIZE=leak work. Once you've got that you can usually call the cmd_whatever() in a loop without much effort, and once you've got that, well, see above :)