Re: [PATCH v5 25/27] refs: add LMDB refs storage backend

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 18, 2016 at 12:17 PM, David Turner <dturner@xxxxxxxxxxxxxxxx> wrote:
> LMDB has a few features that make it suitable for usage in git:
> ...

I'm reading lmdb documents and hitting  the caveat section [1]. Random thoughts

* "There is normally no pure read-only mode, since readers need write
access to locks and lock file.".

This will be a problem for server side that serves git:// protocol
only. Some of those servers disable write access to the entire
repository and git still works fine (but won't when lmdb is used).
Should we do something in this case? Just tell server admins to relax
file access, or use MDB_NOLOCK (and when? based on config var?)

*  " Use an MDB_env* in the process which opened it, without
fork()ing.". We do use fork on non-Windows in run-command.c, but it
should be followed by exec() with no ref access in between, so we're
almost good.

I notice atexit() is used in this for/exec code, which reminds me we
also use atexit() in many other places. I hope none of them access
refs, or we could be in trouble.

* "Do not have open an LMDB database twice in the same process at the
same time. Not even from a plain open() call - close()ing it breaks
flock() advisory locking"

I wonder what happens if we do open twice, will it error out or
silently ignore and move on? Because if it's the latter case, we need
some protection from the caller and I'm not sure if
lmdb-backend.c:init_env() has it, especially when it open submodule's
lmdb.

* "Avoid long-lived transactions...."

OK we don't have a problem with this. But it makes me realize lmdb
transactions do not map with ref transactions. We don't open lmdb
transaction at ref_transaction_begin(), for example. Is it simply more
convenient to do transactions the current way, or is it impossible or
incorrect to attach lmdb transactions to ref_transaction_*()?

* "Avoid aborting a process with an active transaction. The
transaction becomes "long-lived" as above until a check for stale
readers is performed or the lockfile is reset, since the process may
not remove it from the lockfile."

Does it mean we should at atexit() and signal handler to release
currently active transaction?

* "Do not use LMDB databases on remote filesystems, even between
processes on the same host. This breaks flock() on some OSes, possibly
memory map sync, and certainly sync between programs on different
hosts."

OK can't do anything about it anyway, but maybe it should be mentioned
somewhere in Git documentation.

* "Opening a database can fail if another process is opening or
closing it at exactly the same time."

We have some retry logic in resolve_ref_1(). Do we need the same for
lmdb? Not very important though.

[1] http://symas.com/mdb/doc/
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]