Re: A look at some alternative PACK file encodings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



A few notes:


Re: base-128 encodings, it's a pet peeve of mine that meny people, even
while trying to save space, waste it by allowing redundant encodings.
The optimal way, assming msbit=1 means "more", is

	0x00 -> 0		0x01 -> 1
	0x7f -> 127		0x80 0x00 -> 128
	0x80 0x7f -> 255	0x81 0x00 -> 256
	0xfe 0x7f -> 16383	0xff 0x00 -> 16384
	0xff 0x7f -> 16511	0x80 0x00 0x00 -> 16512

The decoding code can be written several ways, but try:

	c = *p++;
	x = c & 127;
	while (c & 128) {
		c = *p++;
		x = ((x + 1) << 7) + (c & 127);
	}

encoding is most easily done in reverse:

	char buf[9];
	char *p = buf+8;

	*p = x & 127;
	while (x >>= 7)
		*--p = 0x80 | (--x & 127);

	write_out(p, buf + 9 - p);


If you want to do signed offsets, the easiest way to write the code
is to convert to an unsigned with the sign bit as lsbit:

	u = (s << 1) ^ -(s < 0);

And then feed the resulting unsigned number to the functions above.
Handling the sign bit specially in the encoding is messier.


And finally, regarding qsort(), stability is not guaranteed, but glibc
actually uses a stable merge sort if it can allocate the memory.  See
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/stdlib/msort.c?rev=1.21&cvsroot=glibc

This leads to the slightly annoying question of whether it's worth
writing a stable sort just to make git work a little bit better
on non-glibc platforms, when it doesn't affect most current users
personally and it still works correctly with an unstable qsort.

It might be simplest to simply lift the glibc mergesort implementation,
possibly inlining the compares for efficiency.  If you touch the code,
please (my eyes!  it hurts us!) consider fixing the brace style.
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]