Re: More precise tag following

Johannes Schindelin wrote:
Ah, I think you fall in the "files matter" trap.

My point is: for what git does, it does not need information which might or might not be present; it derives what it needs from information which was there from the beginning: the ancestry path.

Many people don't use or even need blame. And what you want to introduce would affect them, too.

Many people do not use colored diffs.  Introducing colored diff support affects them, too.  In which way?  Additional command line switches, for example.  I don't think that's a big deal, and neither is a reverse map to create object-level DAGs.

That is why I proposed a cache (of precomputed data): you don't have to change _anything_ in the file format, but you can speed the processes up -- locally! -- if they matter to you.

Which means it works on old repositories, too.

Maybe I was not clear enough.  I do not propose to change the file format, but to extend the information stored.  In whatever way.  However I think that keeping this information along with trees in pack files seems very sensible.  Or alongside pack files, whatever.

It might be sufficient for git.git, but certainly not for projects with a long history. We are talking KDE, FreeBSD, OOo, something like this. They each have about 400k commits. It takes literally *minutes* to get a rev-list or a blame for a certain path. The algorithm simply does not scale. And this has nothing to do with superior output, because hg does it in O(num_of_file_revs), so it *can* be done.
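As a rough illustration of the scaling argument (a toy in-memory model in Python, not git's or hg's actual data structures; HISTORY, commits_touching_by_walk and build_reverse_index are hypothetical names, and "touches" is simplified to "blob differs from every parent"): finding the commits for one path by walking the ancestry visits every commit, while a precomputed reverse map answers the same question with a lookup whose result size is the number of file revisions.

```python
from collections import defaultdict

# Toy history: commit id -> (parent ids, {path: blob id}).
# Hypothetical data for illustration only, not git's object format.
HISTORY = {
    "c1": ([], {"a.txt": "b1", "b.txt": "b2"}),
    "c2": (["c1"], {"a.txt": "b3", "b.txt": "b2"}),
    "c3": (["c2"], {"a.txt": "b3", "b.txt": "b4"}),
    "c4": (["c3"], {"a.txt": "b5", "b.txt": "b4"}),
}


def _touches(history, cid, path):
    """Simplified rule: a commit touches path if its blob differs from all parents."""
    parents, tree = history[cid]
    blob = tree.get(path)
    # A root commit is compared against "path absent" (None).
    parent_blobs = [history[p][1].get(path) for p in parents] or [None]
    return all(pb != blob for pb in parent_blobs)


def commits_touching_by_walk(history, tip, path):
    """Visit every commit reachable from tip; cost grows with total commit count."""
    seen, stack, result = set(), [tip], []
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        if _touches(history, cid, path):
            result.append(cid)
        stack.extend(history[cid][0])
    return result


def build_reverse_index(history, tip):
    """One full walk that records, per path, the commits that touched it."""
    index = defaultdict(list)
    seen, stack = set(), [tip]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        parents, tree = history[cid]
        # Consider paths present here or in any parent (covers deletions).
        paths = set(tree)
        for p in parents:
            paths.update(history[p][1])
        for path in paths:
            if _touches(history, cid, path):
                index[path].append(cid)
        stack.extend(parents)
    return dict(index)


walked = commits_touching_by_walk(HISTORY, "c4", "a.txt")
index = build_reverse_index(HISTORY, "c4")
looked_up = index["a.txt"]  # no history walk needed once the map exists
```

The index is built from the ancestry alone, so nothing in the commits or trees has to change for it to exist.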

But can hg do it that fast, if you track code _movement_ between files? I doubt it.

I don't know if git can, at the moment, but even if it cannot, in future versions this may well be possible, exactly because we do _not_ rely on metadata to be stored in the objects, which can be derived from the history as-is anyway.

Please don't take the mention of hg as an attack on git.  You don't have to shoot back.  It was just to illustrate that this information can be used to speed up certain operations considerably.  Besides, I don't think that hg's repo format prevents it from doing things which git can do.  Just some things might be less elegant or easy.

The important part is that you should not change the file format when you do not have to.

No doubt.  Especially not in a way which breaks backwards compatibility.

Rather, calculate the information you need from the existing data, and if you can reuse it, store it locally. _That_ is flexibility.

Of course this is flexibility.  But this also means that every consumer has to do this for every repo.  Wouldn't it be nice to have it done one time and then stored in a pack?

It also gives me a warm fuzzy feeling that no bogus "auxiliary information" can be introduced by fetching from somewhere else. (It does not matter if intended or unintended.)

I agree on that.

And if something is wrong with that "auxiliary information", it can be regenerated correctly, without touching the real data -- the commit ancestry.

Yes, it always can be regenerated.  I never said it should be made part of the core structure.
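To make "it can always be regenerated" concrete, here is a minimal sketch in Python of a local cache that is treated as disposable. Everything in it is an assumption for illustration: the JSON file layout, the name load_or_rebuild, and the rebuild callback standing in for whatever derives the map from the commit ancestry. If the file is missing, unreadable, or was computed for a different tip, it is thrown away and rebuilt.

```python
import json
import os
import tempfile


def load_or_rebuild(cache_path, tip, rebuild):
    """Return (index, rebuilt). The cache is only trusted if it parses
    and was computed for the same tip; otherwise it is regenerated."""
    try:
        with open(cache_path) as fh:
            data = json.load(fh)
        if (isinstance(data, dict)
                and data.get("tip") == tip
                and isinstance(data.get("index"), dict)):
            return data["index"], False
    except (OSError, ValueError):
        # Missing or corrupt cache: fall through and regenerate.
        pass
    index = rebuild(tip)
    with open(cache_path, "w") as fh:
        json.dump({"tip": tip, "index": index}, fh)
    return index, True


def fake_rebuild(tip):
    # Hypothetical stand-in for deriving the map from the real history.
    return {"a.txt": ["c4", "c2", "c1"]}


cache_dir = tempfile.mkdtemp()
cache_file = os.path.join(cache_dir, "path-index.json")

first = load_or_rebuild(cache_file, "c4", fake_rebuild)   # no cache yet
second = load_or_rebuild(cache_file, "c4", fake_rebuild)  # served from cache

with open(cache_file, "w") as fh:
    fh.write("not json at all")                           # simulate damage
third = load_or_rebuild(cache_file, "c4", fake_rebuild)   # rebuilt again
```

The real data is never written to here; only the derived file is replaced.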

Besides, we already introduced an orthogonal historisation by reflogs, and your method would not cope gracefully with that, would it?
I don't see how reflogs can play into this. After all we're talking about the series of commits the blob experienced to get into its current state, not the series of actions it took this repo to contain this blob.
My point was that you want to introduce a reverse mapping onto the history DAG. But this claims that there is only one history you can possibly look at. This assumption is wrong.

Then you are reading it wrong.  It is just a way to speed up the common way of operation.  That doesn't mean that other ways stop working.  git-rev-list does one thing, and you wouldn't call it ungraceful just because it doesn't operate on reflogs, would you?

It can make a lot of sense to git-blame a change on a pull, maybe because you don't want to fix it yourself, but throw it all back to the lieutenant whom you pulled that part from.

You could find that pull (in theory; I don't think it works right now) with git-blame walking the _reflogs_ instead of the _commit history_.

Fair enough.  Nobody said that this wouldn't work anymore.  I just said that working on commit history could be sped up considerably.

cheers
 simon

--
Serve - BSD     +++  RENT this banner advert  +++    ASCII Ribbon   /"\
Work - Mac      +++  space for low €€€ NOW!1  +++      Campaign     \ /
Party Enjoy Relax   |   http://dragonflybsd.org      Against  HTML   \
Dude 2c 2 the max   !   http://golden-apple.biz       Mail + News   / \
