Re: A tracking tree for the active work space

On 3/11/07, Junio C Hamano <junkio@xxxxxxx> wrote:
> "Jon Smirl" <jonsmirl@xxxxxxxxx> writes:
>
> > Reading the other thread on tracking temporary changes made me think
> > of using inotify with git. The basic idea would be to have a daemon
> > running that uses inotify to listen for changes in the working tree.
> > As these changes happen they get committed to a tracking tree.
>
> I think it is an interesting idea, but can be used with any SCM
> not just git ;-).

As for the part about 'git grep', Shawn and I have been talking off
and on about experimenting with an inverted index for a packfile
format. The basic idea is that you tokenize the input and turn a
source file into a list of tokens. You diff the lists of tokens
like you would normally do with text. There is a universal dictionary
for tokens; a token's id is its position in the dictionary.
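
A minimal sketch of the tokenize-and-diff idea, in Python. Everything
here is an assumption made for illustration: the token regex, the
Dictionary class and the token_diff helper are hypothetical names, and
difflib stands in for whatever delta algorithm a real packfile format
would use.

```python
import difflib
import re

# Assumed token rule: runs of word characters, runs of whitespace, or a
# single punctuation character. Every character matches one alternative,
# so joining the tokens reproduces the input exactly.
TOKEN_RE = re.compile(r"\w+|\s+|[^\w\s]")


def tokenize(text):
    """Split text into a list of tokens without losing any characters."""
    return TOKEN_RE.findall(text)


class Dictionary:
    """Shared token dictionary: a token's id is its position in the list."""

    def __init__(self):
        self.tokens = []   # id -> token
        self.ids = {}      # token -> id

    def encode(self, text):
        out = []
        for tok in tokenize(text):
            if tok not in self.ids:
                self.ids[tok] = len(self.tokens)
                self.tokens.append(tok)
            out.append(self.ids[tok])
        return out

    def decode(self, ids):
        return "".join(self.tokens[i] for i in ids)


def token_diff(old_ids, new_ids):
    """Return the non-equal opcodes between two token-id sequences."""
    matcher = difflib.SequenceMatcher(a=old_ids, b=new_ids, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


d = Dictionary()
old = d.encode("int x = 1;\n")
new = d.encode("int y = 1;\n")
changes = token_diff(old, new)
```

The diff operates on integer ids rather than characters or lines, so
renaming one identifier shows up as a single replaced token.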

Tokenized text is one of the most compact compression schemes known.
It can get even more compact by tokenizing common phrases and using
variable length token ids. Compression schemes like this are used in
web search engines. Of course you keep a check in place for input that
doesn't tokenize (binary) and fall back to gzip.
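
To make the variable-length ids and the gzip fallback concrete, here is
a rough Python sketch. The encoding is an assumption (a 7-bits-per-byte
varint, so the most frequent tokens should be given the smallest ids),
as are the one-byte "T"/"Z" tags and the looks_binary heuristic. All
function names are hypothetical.

```python
import gzip


def pack_ids(ids):
    """Encode token ids as varints: 7 payload bits per byte, low bits first."""
    out = bytearray()
    for n in ids:
        while n >= 0x80:
            out.append((n & 0x7F) | 0x80)  # high bit set: more bytes follow
            n >>= 7
        out.append(n)
    return bytes(out)


def unpack_ids(data):
    """Inverse of pack_ids."""
    ids, n, shift = [], 0, 0
    for byte in data:
        n |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            ids.append(n)
            n, shift = 0, 0
    return ids


def looks_binary(raw):
    """Assumed heuristic: NUL bytes or invalid UTF-8 mean 'does not tokenize'."""
    if b"\0" in raw:
        return True
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def store(raw, text_to_ids):
    """Tokenize text blobs; fall back to gzip for anything that looks binary.

    text_to_ids is a caller-supplied function mapping a string to token ids.
    """
    if looks_binary(raw):
        return b"Z" + gzip.compress(raw)
    return b"T" + pack_ids(text_to_ids(raw.decode("utf-8")))
```

Ids below 128 cost one byte and ids below 16384 cost two, which is why
ordering the dictionary by frequency matters for this kind of scheme.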

To build 'git grep' you make a bitmap index for each token in the
dictionary, with one bit per file, and set the bit if the file has the
token. Gzip these indexes and then there are algorithms for doing
AND/OR operations on the zipped indexes without expanding them. grep is
almost instant over gigabytes of text if indexes like this are available.
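
A small Python sketch of the per-token bitmap index follows. It uses
plain Python integers as uncompressed bitmaps, so it does not show the
compressed-domain AND/OR algorithms mentioned above; it only
illustrates how token queries reduce to bitwise operations. The
function names and the sample file contents are made up for the
example.

```python
from collections import defaultdict


def build_index(files):
    """Build one bitmap per token; bit i is set if file i contains the token.

    files maps a file name to an iterable of its tokens.
    Returns (names, bitmaps), where names[i] is the file for bit i.
    """
    names = sorted(files)
    bitmaps = defaultdict(int)
    for bit, name in enumerate(names):
        for tok in set(files[name]):
            bitmaps[tok] |= 1 << bit
    return names, dict(bitmaps)


def query_and(bitmaps, tokens):
    """Bitmap of files containing every token in tokens."""
    result = -1  # all bits set
    for tok in tokens:
        result &= bitmaps.get(tok, 0)
    return result


def query_or(bitmaps, tokens):
    """Bitmap of files containing at least one token in tokens."""
    result = 0
    for tok in tokens:
        result |= bitmaps.get(tok, 0)
    return result


def files_for(names, bitmap):
    """Translate a result bitmap back into file names."""
    return [name for i, name in enumerate(names) if (bitmap >> i) & 1]


files = {
    "a.c": ["int", "main"],
    "b.c": ["int", "foo"],
    "c.c": ["foo"],
}
names, bitmaps = build_index(files)
both = files_for(names, query_and(bitmaps, ["int", "foo"]))
either = files_for(names, query_or(bitmaps, ["main", "foo"]))
```

A real grep would still have to scan the candidate files to find the
matching lines; the index only narrows down which files to open.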

Keeping everything up to date on a dual core system is pretty much
free since that second core is rarely doing anything while you are
editing.

--
Jon Smirl
jonsmirl@xxxxxxxxx