On Fri, Feb 9, 2024 at 5:31 AM <rsbecker@xxxxxxxxxxxxx> wrote: > > On Thursday, February 8, 2024 9:30 PM, Kyle Lippincott wrote: > >While thinking about libification, I was wondering what we can/need to do about > >symbols (specifically functions, since our libraries will likely have few to no extern > >variables) that need to escape their translation unit (.c file) but that we don't want > >to risk colliding with symbols from our "host" project. > > > >For any header that we're offering up as an API boundary we can have prefixed > >names, but there are symbols from git-compat-util.h with simple and likely > >common names like `die`, `usage`, error`, etc. I'm far from an expert on linkers, but > >I'm under the impression that even though we'll only be #including git-compat- > >util.h in our own .c files (so the compiler for the host project wouldn't see them), > >the produced static library will still be "providing" these symbols unless we mark > >them as `static` (and if we mark them as `static`, they can't be used outside of their > >translation unit). This means that if the host project has a version of `die` (or links > >against yet another library that does), we run into what C++ calls a One Definition > >Rule (ODR) > >violation: we have two providers of the symbol `die` with different > >implementations, and the behavior is undefined (no error needs to be generated as > >far as I know). > > > >With dynamic libraries I believe that we have more control over what gets exposed, > >but I don't know of functionality for this when linking statically. I'm assuming there > >is no such functionality, as projects like openssl (ty Randall for mentioning this) > >appear to have a convention of prefixing the symbols they put in their "public" API > >(i.e. in non-internal header files) with things like OSSL_, and of prefixing the symbols > >they put in their "private" APIs that can't be marked as `static` with `ossl_`. I'd love > >to be wrong about this. :) > > > >If I'm right that this is an issue, does this imply that we'd need to rename every non- > >static function in the git codebase that's part of a library to have a `git_` prefix, even > >if it won't be used outside of the git project's own .c files? Is there a solution that > >doesn't involve making it so that we have to type `git_` a billion times a day that's > >also maintainable? We could try to guess at how likely a name collision would be > >and only do this for ones where it's obviously going to collide, but if we get it wrong, > >I'm concerned that we'll run into subtle ODR violations that *at best* erode > >confidence in our library, and can actually cause outages, data corruption, and > >security/privacy issues. > > I think we only need to do this for functions that are in the libification code-base for non-static symbols (and any data elements that may end up in a DLL some day). I believe the hope is that the majority/entirety of plumbing code will be provided as a library, and we'll likely want to have a significant portion of porcelain code as well. I think we're really talking about (effectively) "all of git", but not all at once. If we attempt to make things safe based on guesses about what's likely to collide with other project's code, we'll (a) get it wrong, and only discover later when they try to add our library to their project, and (b) have a maintenance burden, where we now have to think about every function name we introduce, which would not be fun (and we'll get it wrong. Frequently.) > The bulk of the non-libified code base would only need to adapt to new symbol names if those symbols are specifically moved. I'm not following what you mean by "moved" here. > die(), error(), are probably going to be impacted, but they can be aliased with #defines internally to git to git_die() or git_error(), for example. > --Randall >