On Tue, Aug 8, 2017 at 12:52 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Mon, Aug 07, 2017 at 03:40:48PM +0000, David Turner wrote:
>
>> > -----Original Message-----
>> > From: Shawn Pearce [mailto:spearce@xxxxxxxxxxx]
>> > In git-core, I'm worried about the caveats related to locking. Git tries to work
>> > nicely on NFS, and it seems LMDB wouldn't. Git also runs fine on a read-only
>> > filesystem, and LMDB gets a little weird about that. Finally, Git doesn't have
>> > nearly the risks LMDB has about a crashed reader or writer locking out future
>> > operations until the locks have been resolved. This is especially true with shared
>> > user repositories, where another user might setup and own the semaphore.
>>
>> FWIW, git has problems with stale lock file in the event of a crash (refs/foo.lock
>> might still exist, and git does nothing to clean it up).
>>
>> In my testing (which involved a *lot* of crashing), I never once had to clean up a
>> stale LMDB lock. That said, I didn't test on a RO filesystem.
>
> Yeah, I'd expect LMDB to do much better than Git in a crash, because it
> relies on flock. So when the kernel goes away, so too does your lock
> (ditto if a git process dies without remembering to remove the lock,
> though I don't think we've ever had such a bug).
>
> But that's also why it may not work well over NFS (though my impression
> is that flock _does_ work on modern NFS; I've been lucky enough not to
> ever use it). Lack of NFS support wouldn't be a show-stopper for most
> people, but it would be for totally replacing the existing code, I'd
> think. I'm just not clear on what the state of lmdb-on-nfs is.
>
> Assuming it could work, the interesting tradeoffs to me are:
>
>   - something like reftable is hyper-optimized for high-latency
>     block-oriented access. It's not clear to me if lmdb would even be
>     usable for the distributed storage case Shawn has.
>
>   - reftable is more code for us to implement, but we'd "own" the whole
>     stack down to the filesystem. That could be a big win for debugging
>     and optimizing for our use case.
>
>   - reftable is re-inventing a lot of the database wheel. lmdb really is
>     a debugged, turn-key solution.
>
> I'm not opposed to a world where lmdb becomes the standard solution and
> Google does their own bespoke thing. But that's easy for me to say
> because I'm not Google. I do care about keeping complexity and bugs to a
> minimum for most users, and it's possible that lmdb could do that. But
> if it can't become the baseline standard (due to NFS issues), then we'd
> still want something to replace the current loose/packed storage. And if
> reftable does that, then lmdb becomes a lot less interesting.

Peff, thank you for this summary. It echoes my opinions as well.

On the one hand, I love the idea of offloading the database stuff to
lmdb. But it's got two technical blockers for me: behavior on NFS, and
virtualizing onto a different filesystem in userspace.

I really need a specialized reference store on virtualized distributed
storage. The JGit reftable implementation fits that need today. So we're
probably going to go ahead and deploy that in our environment.

I'd like to start writing a prototype reftable in C for git-core soon,
but I've been distracted by the JGit version first. It would be good to
have something to compare against the lmdb approach for git-core before
we make any decisions about what git-core wants to promote as the new
standard for ref storage.
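
For anyone following the crash-behavior point quoted above (a stale
refs/foo.lock versus a lock the kernel drops when the process dies), here
is a minimal sketch of the difference. It is Python and Unix-only (it
uses fork and flock), it is not Git's or LMDB's actual locking code, and
the file names and helper functions are made up for illustration.

```python
import fcntl
import os
import tempfile


def take_lockfile(path):
    """Lock-file style: we hold the lock only if we created `path`."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o666)
    except FileExistsError:
        return False  # someone else (possibly long dead) owns the file
    os.close(fd)
    return True


def take_flock(path):
    """flock style: the kernel holds the lock for as long as the fd is open.

    Returns the open fd on success, or None if another process holds it.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o666)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def crash_while_holding(take, path):
    """Fork a child that takes the lock, then dies without any cleanup."""
    pid = os.fork()
    if pid == 0:
        take(path)
        os._exit(1)  # simulated crash: no handlers run, nothing is unlinked
    os.waitpid(pid, 0)


def demo():
    """Return (lockfile_blocked, flock_available) after two simulated crashes."""
    with tempfile.TemporaryDirectory() as d:
        ref_lock = os.path.join(d, "foo.lock")  # hypothetical file names
        db_lock = os.path.join(d, "db.lock")
        crash_while_holding(take_lockfile, ref_lock)
        crash_while_holding(take_flock, db_lock)
        # The dead child's lock file is still on disk, so we cannot create it.
        lockfile_blocked = not take_lockfile(ref_lock)
        # The dead child's flock went away when the kernel closed its fd.
        fd = take_flock(db_lock)
        flock_available = fd is not None
        if fd is not None:
            os.close(fd)
        return lockfile_blocked, flock_available


if __name__ == "__main__":
    print(demo())
```

On a local Linux filesystem I would expect both values to come back
true: the leftover lock file blocks the next writer until somebody
removes it, while the flock is simply gone. The sketch says nothing about
NFS or read-only filesystems, which are the open questions in this
thread.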