Re: [PATCH v4 20/21] refs: add LMDB refs storage backend

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02/05/2016 08:44 PM, David Turner wrote:
> Add a database backend for refs using LMDB.  This backend runs git
> for-each-ref about 30% faster than the files backend with fully-packed
> refs on a repo with ~120k refs.  It's also about 4x faster than using
> fully-unpacked refs.  In addition, and perhaps more importantly, it
> avoids case-conflict issues on OS X.
> 
> LMDB has a few features that make it suitable for usage in git:
> 

0. It is licensed under the OpenLDAP Public License, which is apparently
GPL compatible [1].

[1] http://www.gnu.org/licenses/license-list.en.html#newOpenLDAP

> 1. It is relatively lightweight; it requires only one header file, and
> the library code takes under 64k at runtime.
> 
> 2. It is well-tested: it's been used in OpenLDAP for years.
> 
> 3. It's very fast.  LMDB's benchmarks show that it is among
> the fastest key-value stores.
> 
> 4. It has a relatively simple concurrency story; readers don't
> block writers and writers don't block readers.
> 
> Ronnie Sahlberg's original version of this patchset used tdb.  The
> major disadvantage of tdb is that tdb is hard to build on OS X.  It's
> also not in homebrew.  So lmdb seemed simpler.
> 
> To test this backend's correctness, I hacked test-lib.sh and
> test-lib-functions.sh to run all tests under the refs backend. Dozens
> of tests use manual ref/reflog reading/writing, or create submodules
> without passing --ref-storage to git init.  If those tests are
> changed to use the update-ref machinery or test-refs-lmdb-backend (or,
> in the case of packed-refs, corrupt refs, and dumb fetch tests, are
> skipped), the only remaining failing tests are the git-new-workdir
> tests and the gitweb tests.

Would it possible to build your "hack" into the test suite? For example,
perhaps one could implement an option that requests tests to use LMDB
wherever possible and skip the tests that are not LMDB-compatible. Over
time, we could try to increase the fraction of tests that are
LMDB-compatible by (for example) using `git update-ref` and `git
rev-parse` rather than reading/writing reference files via the
filesystem directly.

Otherwise, I'm worried that LMDB support will bitrot quickly because it
will be too much of a nuisance to exercise the code.

> Signed-off-by: David Turner <dturner@xxxxxxxxxxxxxxxx>
> ---
>  .gitignore                                     |    1 +
>  Documentation/config.txt                       |    7 +
>  Documentation/git-clone.txt                    |    3 +-
>  Documentation/git-init.txt                     |    2 +-
>  Documentation/technical/refs-lmdb-backend.txt  |   52 +
>  Documentation/technical/repository-version.txt |    5 +
>  Makefile                                       |   12 +
>  configure.ac                                   |   33 +
>  contrib/workdir/git-new-workdir                |    3 +
>  refs.c                                         |    3 +
>  refs.h                                         |    2 +
>  refs/lmdb-backend.c                            | 2052 ++++++++++++++++++++++++
>  setup.c                                        |   11 +-
>  test-refs-lmdb-backend.c                       |   64 +
>  transport.c                                    |    7 +-
>  15 files changed, 2250 insertions(+), 7 deletions(-)
>  create mode 100644 Documentation/technical/refs-lmdb-backend.txt
>  create mode 100644 refs/lmdb-backend.c
>  create mode 100644 test-refs-lmdb-backend.c
> 
> diff --git a/.gitignore b/.gitignore
> index 1c2f832..87d45a2 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -199,6 +199,7 @@
>  /test-path-utils
>  /test-prio-queue
>  /test-read-cache
> +/test-refs-lmdb-backend
>  /test-regex
>  /test-revision-walking
>  /test-run-command
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 06d3659..bc222c9 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1153,6 +1153,13 @@ difftool.<tool>.cmd::
>  difftool.prompt::
>  	Prompt before each invocation of the diff tool.
>  
> +extensions.refsStorage::
> +	Type of refs storage backend. Default is to use the original
> +	files based ref storage.  When set to "lmdb", refs are stored in
> +	the lmdb database backend.  This setting reflects the refs
> +	backend that is currently in use; it is not possible to change
> +	the backend by changing this setting.
> +

Let's give the users a pointer to how they *can* change this setting. Maybe

	Type of refs storage backend. Default is to use the original
	files based ref storage.  When set to "lmdb", refs are stored in
	an LMDB database.  This setting reflects the refs storage
	backend that was chosen via the --ref-storage option when the
	repository was originally created. It is currently
	not possible to change the refs storage backend of an
	existing repository.


>  fetch.recurseSubmodules::
>  	This option can be either set to a boolean value or to 'on-demand'.
>  	Setting it to a boolean changes the behavior of fetch and pull to
> diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
> index 68f56a7..b9c52ce 100644
> --- a/Documentation/git-clone.txt
> +++ b/Documentation/git-clone.txt
> @@ -227,7 +227,8 @@ objects from the source repository into a pack in the cloned repository.
>  
>  --ref-storage=<name>::
>  	Type of ref storage backend. Default is to use the original files
> -	based ref storage.
> +	based ref storage. Set to "lmdb" to store refs in the lmdb database
> +	backend.

Should the second "lmdb" be upper-case? Maybe also in other places where
the LMDB database is being referred to?

>  
>  <repository>::
>  	The (possibly remote) repository to clone from.  See the
> [...]

I still haven't reviewed the actual lmdb-backend code. I guess I'm going
to have to make myself an LMDB expert one of these days...

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]