On 5/15/2019 7:32 PM, Eric S. Raymond wrote:
> Derrick Stolee <stolee@xxxxxxxxx>:
>> On 5/15/2019 3:16 PM, Eric S. Raymond wrote:
>>> The deeper problem is that I want something from Git that I cannot
>>> have with 1-second granularity. That is: a unique timestamp on each
>>> commit in a repository.
>>
>> This is impossible in a distributed version control system like Git
>> (where the commits are immutable). No matter your precision, there is
>> a chance that two machines commit at the exact same moment on two different
>> machines and then those commits are merged into the same branch.
>
> It's easy to work around that problem. Each git daemon has to single-thread
> its handling of incoming commits at some level, because you need a lock on the
> file system to guarantee consistent updates to it.
>
> So if a commit comes in that would be the same as the date of the
> previous commit on the current branch, you bump the incoming commit timestamp.

This changes the commit, causing it to have a different object id, and now
the client that pushed that commit disagrees with your machine on the history.

> That's the simple case. The complicated case is checking for date
> collisions on *other* branches. But there are ways to make that fast,
> too. There's a very obvious one involving a presort that is O(log2
> n) in the number of commits.
>
> I wouldn't have brought this up in the first place if I didn't have a
> pretty clear idea how to do it in code!
>
>> Even when you specify a committer, there are many environments where a set
>> of parallel machines are creating commits with the same identity.
>
> If those commit sets become the same commit in the final graph, this is
> not a problem for total ordering.
>
>>> Why do I want this? There are a number of reasons, all related to a
>>> mathematical concept called "total ordering". At present, commits in
>>> a Git repository only have partial ordering.
>>
>> This is true of any directed acyclic graph.
>> If you want a total ordering
>> that is completely unambiguous, then you should think about maintaining
>> a linear commit history by requiring rebasing instead of merging.
>
> Excuse me, but your premise is incorrect. A git DAG isn't just "any" DAG.
> The presence of timestamps makes a total ordering possible.
>
> (I was a theoretical mathematician in a former life. This is all very
> familiar ground to me.)

Same. But you seem to have a fundamental misunderstanding about the
immutability of commits, which is core to how Git works. If you change a
commit, then you get a new object id and now distributed copies don't agree
on the history.

>>> One consequence is that
>>> action stamps - the committer/date pairs I use as VCS-independent commit
>>> identifications in reposurgeon - are not unique. When a patch sequence
>>> is applied, it can easily happen fast enough to give several successive
>>> commits the same committer-ID and timestamp.
>>
>> Sorting by committer/date pairs sounds like an unhelpful idea, as that
>> does not take any graph topology into account. It happens that commits
>> can actually have an _earlier_ commit date than its parent.
>
> Yes, I'm aware of that. The uniqueness properties that make a total
> ordering desirable are not actually dependent on timestamp order
> coinciding with topo order.
>
>> Changing the granularity of timestamps requires changing the commit format,
>> which is probably a non-starter.
>
> That's why I started by noting that you're going to have to break the
> format anyway to move to an ECDSA hash (or whatever you end up using).
>
> I'm saying that *since you'll need to do that anyway*, it's a good time
> to think about making timestamps finer-grained and unique.

That change is difficult enough as it is. I don't think your goals justify
making this more complicated. You are also not considering:

* The in-memory data type now needs to be a floating-point type, or an even
  larger integer type using a different set of units.
* This data type now affects our priority queues for commit walks, how we
  store the commit date in the commit-graph file, how we compute relative
  dates for 'git log' pretty formats.

-Stolee
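
To illustrate the immutability point made in this reply: a commit's object id
is a hash over its entire contents, committer timestamp included, so a server
that bumps the timestamp of an incoming commit ends up with a different id
than the client that pushed it. A minimal sketch in Python, assuming the SHA-1
object format; the identity, the commit message, the timestamps, and the helper
names (make_commit, commit_object_id) are hypothetical and only for
illustration:

```python
import hashlib

# Well-known id of the empty tree in SHA-1 repositories.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def make_commit(timestamp: int) -> bytes:
    """Build the raw body of a parentless commit (hypothetical identity)."""
    ident = f"A U Thor <author@example.com> {timestamp} +0000"
    return (
        f"tree {EMPTY_TREE}\n"
        f"author {ident}\n"
        f"committer {ident}\n"
        "\n"
        "Example commit\n"
    ).encode()


def commit_object_id(body: bytes) -> str:
    """Hash the body the way Git does: a 'commit <size>' header, a NUL byte, then the content."""
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


pushed = commit_object_id(make_commit(1557963120))
bumped = commit_object_id(make_commit(1557963121))  # receiver bumps by one second
print(pushed)
print(bumped)
print("same id:", pushed == bumped)
```

The two printed ids differ even though only one second of the timestamp
changed, which is why the pushing client and the receiving machine would no
longer agree on the history.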