Re: Possible regression in git-rev-list --header

Hi,

On Sun, 31 Dec 2006, Marco Costalba wrote:

> Regarding qgit parsing 'bug' I would like to point out something that
> probably is not clear.

But it _is_ a bug!

> 1) Parsing routine _must_ be able to sustain the loading of more than
> 40000 revisions in a couple of seconds, so must be very quick. A lot
> of effort has been put to index the header info at maximum speed. Now
> it takes about 300ms to parse the whole linux tree. You can have this
> only if the header format is 'fixed enough', it means that you would
> not expect whole new lines (new '\n' chars) to appear from nowhere in
> header, with the exception of log message and parents info lines of
> course.

I don't see why this would slow down parsing _at all_. Besides, you should 
really stop relying on the header format being fixed for all eternity. 
There has been talk about putting more useful information into the header, 
and there _are_ valid reasons to keep the header extensible.

Further, if you rely on parsing being super-fast, why not just parse 
_only_ the header information that you actually need? The header still 
consists of

	- exactly one "tree",
	- an arbitrary number of "parent" lines,
	- exactly one "author", and
	- exactly one "committer" line.

After that may come optional headers, but by that time you should 
_already_ have stopped parsing! And the order is fixed already 
(parse_commit_buffer() relies on it).
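To make that concrete, here is a rough sketch in Python (not qgit's actual C++ code, and not a copy of parse_commit_buffer()). It assumes the buffer holds one raw commit object starting at the "tree" line; the function name and the returned dictionary are made up for illustration. It reads the four fixed header kinds in order and stops right after "committer", so any later header never reaches the fast path:

```python
def parse_fixed_headers(buf: bytes, start: int = 0):
    """Parse tree, parent*, author, committer; return (fields, offset).

    Hypothetical helper: 'offset' is the position just past the
    "committer" line, i.e. where optional headers or the blank
    line before the message begin.
    """
    def take_line(pos):
        # Raises ValueError if the buffer ends before a newline.
        end = buf.index(b"\n", pos)
        return buf[pos:end], end + 1

    line, pos = take_line(start)
    if not line.startswith(b"tree "):
        raise ValueError("expected 'tree' line")
    tree = line[len(b"tree "):]

    # Zero or more parent lines (a root commit has none).
    parents = []
    line, pos = take_line(pos)
    while line.startswith(b"parent "):
        parents.append(line[len(b"parent "):])
        line, pos = take_line(pos)

    if not line.startswith(b"author "):
        raise ValueError("expected 'author' line")
    author = line[len(b"author "):]

    line, pos = take_line(pos)
    if not line.startswith(b"committer "):
        raise ValueError("expected 'committer' line")
    committer = line[len(b"committer "):]

    fields = {"tree": tree, "parents": parents,
              "author": author, "committer": committer}
    # Stop here on purpose: whatever follows is not needed for
    # organizing the commits.
    return fields, pos
```

Since nothing after the "committer" line is inspected, a new header such as "encoding" cannot confuse this pass, and it costs nothing.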

After all, you have an initial parsing for the purpose of organizing the 
commits, and you can have _another_ for the purpose of displaying the 
message (you can remember the offset where the first parsing stopped to 
accelerate the second). The latter parsing should be done individually, 
when displaying the commit.
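The second, lazy pass could look roughly like the following Python sketch. It assumes the first pass stored, per commit, the offset just past the "committer" line; the function name and return shape are hypothetical. It collects whatever optional headers follow, up to the blank line, and returns the rest as the message:

```python
def parse_remainder(buf: bytes, offset: int):
    """Return (extra_headers, message) for one raw commit buffer.

    Hypothetical helper: 'offset' is where the first pass stopped,
    i.e. just past the "committer" line.
    """
    extra = {}
    pos = offset
    while True:
        end = buf.find(b"\n", pos)
        if end == -1:
            # No blank line found: treat the commit as having no message.
            return extra, b""
        line = buf[pos:end]
        pos = end + 1
        if not line:
            # The blank line separates the headers from the message.
            break
        # Unknown headers are kept, not rejected: "<key> <value>".
        key, _, value = line.partition(b" ")
        extra[key.decode("ascii")] = value
    return extra, buf[pos:]
```

This runs only for the commit being displayed, so its cost does not matter for the initial load, and an "encoding" header simply shows up in the returned dictionary.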

And I still have to disagree with Junio that the encoding header is no 
longer needed when displaying the commit message. The "tree" and "parent" 
headers are also displayed, even though their information has already 
been used in preparing the display.

The commit header contains information about that particular commit, and 
if I ask to see the headers, I want to see them, and not be treated like 
an idiot who does not know how to handle that information.

(If I ask for git-log to show everything encoded in Latin-1, it might 
still be interesting to know who used which encoding. And if it is 
displayed in my local encoding, but the commit header says UTF-8, I _do_ 
know that this is the original encoding, not the displayed one, thank you 
very much!)

So please, Marco, fix that bug in qgit. Otherwise you will restrict our 
ability to enhance commit objects with useful meta information _anyway_. 
IOW, even if the encoding header is not shown (which I would not like), 
you should fix that bug.

Ciao,
Dscho
