[RFC PATCH 0/6] [RFC] Introduce cgit-rs, a Rust wrapper around libgit.a

Josh Steadmon <steadmon@xxxxxxxxxx> · Wed, 7 Aug 2024 11:21:25 -0700

When we, the Git team at Google, first embarked on the libification
journey, we didn’t have a specific consumer to build a library for.
Instead, we were interested in the potential benefits of libification
for many use cases, such as VFSes and submodules. Without a specific
consumer, it has been difficult to evaluate what is and is not
necessary, even for the first library, git-std-lib. Attempting to solve
problems such as error handling, symbol collisions, and
internal/external interfaces while also separating out a library turned
out to be too complex a task to develop and review all at once. While we
still strive to eventually build an ideal library, we have realized that
in order to make meaningful and consistent progress, we have to solve
these problems iteratively and in smaller pieces. That is why, over the
last month, we have been working with the jj project [1] to understand
their current usage of libgit2-rs [2] and gitoxide [3], as well as the
future library functionality they would be interested in. In doing so,
we have built cgit-rs, a Rust wrapper around libgit.a that allows Rust
code to call various basic Git functions.

[1] https://github.com/martinvonz/jj
[2] https://github.com/rust-lang/git2-rs
[3] https://github.com/Byron/gitoxide

This series provides a small Rust wrapper library around parts of
libgit.a, and a proof-of-concept Rust executable that uses the library
to interface with Git. Additionally, we have tested building jj with our
library and used it to replace some of the libgit2-rs uses.

This exercise has clarified a lot of things for us, and we believe that
developing this wrapper further provides benefits both for downstream
consumers and the Git project itself:
* cgit-rs provides wrappers for Rust consumers of libraries (e.g. jj)
* cgit-rs suggests focus areas for libification
  * shows us what potential challenges we face with library consumers
* Git libification improves Git interfaces
* Libification improves cgit-rs FFI

We are putting error handling on hold for now: it is too complex to
solve up front, and we intend other CLIs to be our first customers, for
which simply printing out errors is acceptable.

While the wrapper itself lives in contrib/, there are a couple of
patches that touch git.git code. These patches are necessary for the
wrapper but not for git.git itself, so merging them may seem
unnecessary. However, I would argue that other languages (not just
Rust) have trouble calling functions that take a pointer to a
non-generic object, since callers essentially have to redefine that
object in their own language.
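
To illustrate the pointer problem, here is a minimal sketch (not code
from this series). It assumes the usual opaque-handle pattern: the
library allocates the object and hands back a pointer, so the caller
never needs the struct layout. The patch list mentions
git_configset_alloc, but its signature is not shown in this cover
letter, so every demo_* function below is a hypothetical stand-in, and
the "C side" is faked in Rust so that the example builds without
libgit.a.

```rust
use std::os::raw::c_int;

/// Opaque handle: Rust never sees the C struct's fields, so there is
/// nothing to redefine or keep in sync on the Rust side.
#[repr(C)]
pub struct OpaqueConfigSet {
    _private: [u8; 0],
}

/// Stand-in for the C library so the sketch runs without linking
/// anything. All names in this module are hypothetical.
mod c_side {
    use super::OpaqueConfigSet;
    use std::os::raw::c_int;

    /// The real layout, known only to the "C" side.
    struct RealConfigSet {
        entries: c_int,
    }

    /// Allocate the object inside the library and return an opaque pointer.
    pub extern "C" fn demo_configset_alloc() -> *mut OpaqueConfigSet {
        Box::into_raw(Box::new(RealConfigSet { entries: 0 })) as *mut OpaqueConfigSet
    }

    /// Record one entry and return the new entry count.
    pub unsafe extern "C" fn demo_configset_add(cs: *mut OpaqueConfigSet) -> c_int {
        let real = unsafe { &mut *(cs as *mut RealConfigSet) };
        real.entries += 1;
        real.entries
    }

    /// Release an object obtained from demo_configset_alloc.
    pub unsafe extern "C" fn demo_configset_free(cs: *mut OpaqueConfigSet) {
        drop(unsafe { Box::from_raw(cs as *mut RealConfigSet) });
    }
}

/// Safe Rust wrapper that owns the opaque pointer.
pub struct ConfigSet {
    raw: *mut OpaqueConfigSet,
}

impl ConfigSet {
    pub fn new() -> Self {
        ConfigSet { raw: c_side::demo_configset_alloc() }
    }

    /// Adds one (imaginary) entry and returns the new entry count.
    pub fn add(&mut self) -> c_int {
        // SAFETY: `raw` came from demo_configset_alloc and is freed only in Drop.
        unsafe { c_side::demo_configset_add(self.raw) }
    }
}

impl Drop for ConfigSet {
    fn drop(&mut self) {
        // SAFETY: `raw` is still valid here and is never used again.
        unsafe { c_side::demo_configset_free(self.raw) }
    }
}
```

A caller would write `let mut cs = ConfigSet::new();` and then call
`cs.add()`. Because allocation lives behind the library boundary, a
change to the struct layout does not require touching the Rust side.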

We're sending this series as RFC because there is remaining work
we'd like to do, but we'd like to get early feedback on this approach,
and particularly to ask for advice on a few topics:

* alternative methods of exposing only a subset of symbols in our
  library

* bikeshedding on the name (yes, really). There is an active, unrelated
  CGit project [4] that we only recently became aware of. We originally
  took the name "cgit" because at $DAYJOB we sometimes refer to git.git
  as "cgit" to distinguish it from jgit [5].

* gauging the level of interest in calling Git code from Rust

[4] https://git.zx2c4.com/cgit
[5] https://www.eclipse.org/jgit

Remaining work includes:

* finding a better solution to the common-main split. We should probably
  have a separate initialization function including all of main() up to
  the call to cmd_main(), which can then be exposed in cgit-rs.

* adding unit and integration tests

* Makefile cleanup, particularly adding config.mak options that
  developers can set to run Rust builds and tests by default

* automating the process of exporting additional functions via cgit-rs
  (possibly with a wrapper script around bindgen [6])

[6] https://github.com/rust-lang/rust-bindgen

Finally, a quick discussion about symbol collisions: if functions are
not prefixed with “libgit_” or something similar, we are left open to
collision issues in the future, and this would probably have come up
with libification in general anyway. It therefore seems necessary to
wrap all of the symbols we are looking to expose. While this seems
non-ideal, we couldn’t come up with a better method. Our next best
alternative is to simply expose all symbols by default, but this leads
to symbol collisions when library users link both cgit-rs and
libgit2-rs.
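
As a rough illustration of what wrapping every exported symbol could
look like if it were generated rather than written by hand, here is a
small sketch. It is not part of this series: the CFunction input format,
the wrapper_definition helper, and the exact shape of the generated C
are all assumptions made for the example. Only the “libgit_” prefix
comes from the discussion above.

```rust
/// Description of one C function to re-export (hypothetical input format).
pub struct CFunction<'a> {
    /// C return type, e.g. "int" or "void".
    pub ret: &'a str,
    /// Name of the existing, unprefixed function.
    pub name: &'a str,
    /// (type, parameter name) pairs, in declaration order.
    pub params: &'a [(&'a str, &'a str)],
}

/// Prefix applied to every exported symbol.
pub const PREFIX: &str = "libgit_";

/// Render the C source of a prefixed wrapper that forwards to the
/// original function.
pub fn wrapper_definition(f: &CFunction) -> String {
    // Parameter list for the wrapper's signature; C spells "no
    // parameters" as "void".
    let decl_params = if f.params.is_empty() {
        "void".to_string()
    } else {
        f.params
            .iter()
            .map(|(ty, name)| format!("{} {}", ty, name))
            .collect::<Vec<_>>()
            .join(", ")
    };
    // Argument list for the forwarded call: parameter names only.
    let call_args = f
        .params
        .iter()
        .map(|(_, name)| *name)
        .collect::<Vec<_>>()
        .join(", ");
    // A void function forwards the call without a `return`.
    let ret_kw = if f.ret == "void" { "" } else { "return " };
    format!(
        "{} {}{}({})\n{{\n\t{}{}({});\n}}\n",
        f.ret, PREFIX, f.name, decl_params, ret_kw, f.name, call_args
    )
}
```

Given a description of a function named demo_add that takes two ints,
the helper emits a libgit_demo_add wrapper that forwards to it. Only the
prefixed names would then need to be visible to library users, which is
what avoids clashes with another Git library linked into the same
binary.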

Calvin Wan (2):
  contrib/cgit-rs: add repo initialization and config access
  contrib/cgit-rs: add a subset of configset wrappers

Josh Steadmon (4):
  common-main: split common_exit() into a new file
  repository: add initialize_repo wrapper without pointer
  contrib/cgit-rs: introduce Rust wrapper for libgit.a
  config: add git_configset_alloc

 .gitignore                             |  1 +
 Makefile                               | 14 ++++
 common-exit.c                          | 26 +++++++
 common-main.c                          | 24 -------
 config.c                               |  5 ++
 config.h                               |  5 ++
 contrib/cgit-rs/Cargo.lock             | 99 ++++++++++++++++++++++++++
 contrib/cgit-rs/Cargo.toml             | 17 +++++
 contrib/cgit-rs/README.md              | 15 ++++
 contrib/cgit-rs/build.rs               | 33 +++++++++
 contrib/cgit-rs/public_symbol_export.c | 72 +++++++++++++++++++
 contrib/cgit-rs/public_symbol_export.h | 26 +++++++
 contrib/cgit-rs/src/lib.rs             | 81 +++++++++++++++++++++
 contrib/cgit-rs/src/main.rs            | 44 ++++++++++++
 repository.c                           |  9 +++
 repository.h                           |  1 +
 16 files changed, 448 insertions(+), 24 deletions(-)
 create mode 100644 common-exit.c
 create mode 100644 contrib/cgit-rs/Cargo.lock
 create mode 100644 contrib/cgit-rs/Cargo.toml
 create mode 100644 contrib/cgit-rs/README.md
 create mode 100644 contrib/cgit-rs/build.rs
 create mode 100644 contrib/cgit-rs/public_symbol_export.c
 create mode 100644 contrib/cgit-rs/public_symbol_export.h
 create mode 100644 contrib/cgit-rs/src/lib.rs
 create mode 100644 contrib/cgit-rs/src/main.rs

base-commit: 557ae147e6cdc9db121269b058c757ac5092f9c9
-- 
2.46.0.rc2.264.g509ed76dc8-goog