Re: topological index field for commit objects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 30, 2016 at 11:12:52AM -0700, Linus Torvalds wrote:

> I do think that it's ok to cache generation numbers somewhere if there
> is an algorithm that can make use of them, but every time this comes
> up, it's just not been important enough to make a big deal and a new
> incompatible object format for it. The committer date is preexisting
> and has existing pseudo-generation-number usage, so..improving on the
> quality of it sounds like a good idea.

If you are OK with a cache, I don't think one needs to change the object
format at all. It can be computed on the fly, and is purely a local
optimization.

> The first step should probably be to make fsck warn about the existing
> cases of "commit has older date than parents". Something like the
> attached patch, perhaps?

I have mixed feelings on this, because it forces the user to confront
the issue at a time that's potentially very far from when it actually
happened (and often when it is not their fault).

I expect most people don't run fsck at all under normal circumstances,
so it is only when they have a problem of some sort that they would see
this warning at all. It would kick in and prevent objects being
transferred when things like receive.fsckObjects are configured, but
it's not on by default. GitHub does enable it, so pushing there is often
the first notification people get about any kind of problem.

This is a general problem with all fsck-driven warnings (people may
fetch without realizing they're getting breakage, and such objects may
get years of history built on top). But I think it can be even worse
with timestamps, because the broken state may not even be recognizable
when you first fetch the troublesome object.

E.g., imagine somebody else has their clock set forward, and you fetch
from them. Their object by itself is not broken. It is only when you
want to commit on top of it, with the correct clock, that the broken
state is created (and then, we cannot know whether it is your clock or
the original committer's clock that is bad).

So I think it would be more productive to put a check like this in "git
commit" rather than (or perhaps in addition to) fsck. That prevents
us creating the broken relationship, but it does still mean the user may
have to to go back and tell the original committer that their clock was
broken.

You could also have the fsck check look not only for out-of-order
commits, but also commits in the future (from the perspective of the
receiver). That would reject such broken commits before they even hit
your repository (though again, it is unclear in such a case if the
commit is broken or the clock of the checker).

> +static void fsck_commit_date(struct fsck_options *options, struct commit *commit)
> +{
> +	struct commit_list *p;
> +
> +	for (p = commit->parents; p; p = p->next) {
> +		struct commit *parent = p->item;
> +		if (commit->date < parent->date)
> +			report(options, &commit->object, FSCK_MSG_DATE_ORDERING, "Bad commit date ordering with parent");
> +	}
> +}

I didn't test it, but I suspect this won't work reliably, as we do not
always have the parents parsed during an fsck check (e.g., during
"index-pack --strict").

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]