Re: Compression and dictionaries

Just a note: there are index structures that support regular expression
searching.  In particular, a PAT tree, usually represented implicitly as
a PAT array, can be walked by a finite automaton to find all the places
it matches.

However, there's a lot of code complexity associated with that.  And a
PAT array assumes efficient random access to the text being indexed,
as it does not keep a copy of the text.
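As a rough illustration of the walk described above, here is a minimal
sketch in Python.  It is not taken from any existing implementation: the
names build_suffix_array, narrow and find are hypothetical, the
construction is the naive one, and the "automaton" is restricted to a
fixed-length sequence of character sets (so "g[iu]t" but not "g.*t").
A real regular expression walk would have to track automaton states
instead of a simple depth counter.

```python
def build_suffix_array(text):
    # Naive construction (sort all suffixes); fine for a sketch,
    # far too slow for a real object database.
    return sorted(range(len(text)), key=lambda i: text[i:])


def narrow(text, sa, lo, hi, depth, ch):
    """Within sa[lo:hi], whose suffixes all share their first `depth`
    characters, return the subrange whose next character equals `ch`."""
    def char_at(k):
        pos = sa[k] + depth
        # A suffix that has run out of text sorts before any character.
        return text[pos] if pos < len(text) else ""

    a, b = lo, hi
    while a < b:                      # first entry with char >= ch
        m = (a + b) // 2
        if char_at(m) < ch:
            a = m + 1
        else:
            b = m
    start, b = a, hi
    while a < b:                      # first entry with char > ch
        m = (a + b) // 2
        if char_at(m) <= ch:
            a = m + 1
        else:
            b = m
    return start, a


def find(text, sa, pattern):
    """pattern is a list of character sets, one per position.
    Returns the sorted start offsets of all matches."""
    ranges = [(0, len(sa))]
    for depth, allowed in enumerate(pattern):
        next_ranges = []
        for lo, hi in ranges:
            for ch in sorted(allowed):
                s, e = narrow(text, sa, lo, hi, depth, ch)
                if s < e:
                    next_ranges.append((s, e))
        ranges = next_ranges
    return sorted(sa[k] for lo, hi in ranges for k in range(lo, hi))


text = "git grep greps gut"
sa = build_suffix_array(text)
print(find(text, sa, [{"g"}, {"i", "u"}, {"t"}]))   # [0, 15]
```

Note that narrow() reads the text at arbitrary offsets on every probe;
the array itself only holds positions.  That is the random-access
assumption mentioned above.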


Perhaps most importantly, this would be a big change to "git grep", as
it would search every object in the database, not a particular commit.
And mapping objects back to filenames in trees and commits requires
another index.


Compression dictionaries and indexes pull in opposite directions.  For a
compression dictionary, you prefer common words that appear in many places.
For an index, you prefer rare words that identify a small set of files.
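
To make the tension concrete, here is a small sketch in Python that
ranks the same words both ways.  Everything in it is an assumption made
for illustration: the function names, the whitespace tokenizer, and the
use of plain document frequency (number of files containing a word) as
the score.  A real dictionary builder or indexer would weigh other
things as well.

```python
from collections import Counter


def document_frequency(files):
    """Map each word to the number of files it appears in.
    `files` maps a file name to its text."""
    df = Counter()
    for text in files.values():
        df.update(set(text.split()))   # count each word once per file
    return df


def rank_words(files):
    """Return (dictionary_order, index_order): the same words sorted
    most-common-first and rarest-first, ties broken alphabetically."""
    df = document_frequency(files)
    dictionary_order = sorted(df, key=lambda w: (-df[w], w))
    index_order = sorted(df, key=lambda w: (df[w], w))
    return dictionary_order, index_order


files = {
    "a.c": "int main return",
    "b.c": "int foo return",
    "c.c": "int bar",
}
dictionary_order, index_order = rank_words(files)
print(dictionary_order[:2])   # ['int', 'return']
print(index_order[:3])        # ['bar', 'foo', 'main']
```

The word "int" is the best dictionary entry here and the worst index
key: it occurs in every file, so it compresses well but narrows a search
not at all.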