On Wed, 5 Sep 2007, Jon Smirl wrote:

> On 9/5/07, Julian Phillips <julian@xxxxxxxxxxxxxxxxx> wrote:
>> On Wed, 5 Sep 2007, Jon Smirl wrote:
>>> On 9/5/07, Andreas Ericsson <ae@xxxxxx> wrote:
>>>> Jon Smirl wrote:
>>>>> The path name field needs to be moved back into the blobs to support
>>>>> alternative indexes. For example I want an index on the Signed-off-by
>>>>> field. I use this index to give me the SHAs for the blobs
>>>>> Signed-off-by a particular person. In the current design I have no way
>>>>> of recovering the path name for these blobs other than a brute force
>>>>> search following every path looking for the right SHA.
>>>>
>>>> Ah, there we go. A use-case at last :)
>>
>> But not a brilliant one. You sign off on commits not blobs. So you go
>> from the sign-off to paths, then to blobs. There is no need to go from
>> blob to path unless you deliberately introduce such a need.
>
> Use blame for an example. Blame has to crawl every commit to see if it
> touched the file. It keeps doing this until it figures out the last
> author for every line in the file. Worst case blame has to crawl every
> commit in the data store.

And how does having the path in the blob help here? The important
information is which commits touched the file, and that
information is expensive to get in git because it is snapshot-based. You have to
go back through all the commits looking for changes to the given path.
The information you might want to cache is which commits touched the file,
which you could do without changing the current data storage. Presumably
you are suggesting that such a cache would be cleaner with the filename in
the blob? Or do you think that it would somehow be faster to create? If
so, how?
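
To make the point concrete, here is a rough sketch of such a cache over a
toy snapshot-based history. It is only an illustration under stated
assumptions: the COMMITS layout and the names commits_touching and
build_touch_cache are made up for this example and are not git's object
format or API. Each toy commit stores a parent and a full path -> blob id
snapshot, so finding the commits that touched a path means comparing each
snapshot with its parent's.

```python
from collections import defaultdict

# Toy history (hypothetical layout, not git's real storage):
# each commit has one parent and a full snapshot mapping path -> blob id.
COMMITS = {
    "c1": {"parent": None, "tree": {"a.c": "b1", "b.c": "b2"}},
    "c2": {"parent": "c1", "tree": {"a.c": "b3", "b.c": "b2"}},
    "c3": {"parent": "c2", "tree": {"a.c": "b3", "b.c": "b4"}},
}


def parent_tree(commit, commits):
    """Return the parent's snapshot, or an empty one for a root commit."""
    parent = commit["parent"]
    return commits[parent]["tree"] if parent is not None else {}


def commits_touching(path, head, commits):
    """Uncached lookup: walk back from head, comparing snapshots at `path`."""
    touched = []
    cid = head
    while cid is not None:
        commit = commits[cid]
        if commit["tree"].get(path) != parent_tree(commit, commits).get(path):
            touched.append(cid)
        cid = commit["parent"]
    return touched


def build_touch_cache(head, commits):
    """One walk over history, recording path -> commits that changed it."""
    cache = defaultdict(list)
    cid = head
    while cid is not None:
        commit = commits[cid]
        old = parent_tree(commit, commits)
        for path in set(commit["tree"]) | set(old):
            if commit["tree"].get(path) != old.get(path):
                cache[path].append(cid)
        cid = commit["parent"]
    return dict(cache)


if __name__ == "__main__":
    cache = build_touch_cache("c3", COMMITS)
    print(cache["a.c"])
    print(commits_touching("b.c", "c3", COMMITS))
```

Note that the cache is keyed by path and built purely from the existing
commit and tree data. Nothing in the sketch needs the blob to know its own
name.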
--
Julian