On Oct 28, 2007, at 12:59 AM, Guy Rouillier wrote:
Matthew Wilson wrote:
I have a lot of code -- millions of lines at this point, written
over the last 5 years. Everything is in a bunch of nested folders.
At least once a week, I want to find some code that uses a few modules,
so I have to launch a find + grep at the top of the tree and then wait
for it to finish.
I wonder if I could store our source code in a PostgreSQL table and
then use full text searching to index it. Then I hope I could run a
query where I ask for all files that use modules X, Y, and Z.
DBMSs are great tools for the right job, but IMO this is not the
right job. I can't see how a database engine, with all its
transactional overhead and many other layers, will ever beat a
simple grep performance-wise. I've used Eclipse for refactoring,
but having done it once, I'm sticking with grep.
This is exactly what cscope is good for.
http://cscope.sourceforge.net/
I've used it since the early '90s. I do level 3 support for really
big companies. If you are an emacs fan, it's hooked into it as well.
You want to use the -q option. If it is a million lines of code, the
initial index build is going to take a while. It pseudo-parses the
code (some tricky constructs will confuse it) and builds a very simple
database file. I think it uses a Berkeley DB file. After that, finding
all the occurrences of foo takes a few seconds.
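
To make the "build once, look up fast" idea concrete, here is a minimal
Python sketch of an inverted index that answers the original question
(which files mention X, Y, and Z). It is only an illustration of the
general technique, not cscope's actual parser or file format; the
function names and the sample files are made up for the example.

```python
import re
from collections import defaultdict

# Rough identifier pattern; a real tool parses the language more carefully.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_index(files):
    """Map each identifier to the set of file names that mention it.

    `files` is a dict of {file name: source text}; a real tool would
    walk the tree on disk and save the index to a file instead.
    """
    index = defaultdict(set)
    for name, text in files.items():
        for word in set(IDENT.findall(text)):
            index[word].add(name)
    return index

def files_using_all(index, words):
    """Return the sorted files that mention every word in `words`."""
    sets = [index.get(w, set()) for w in words]
    return sorted(set.intersection(*sets)) if sets else []

# Hypothetical sample tree, for illustration only.
sample = {
    "a.py": "import os, sys\nimport re\n",
    "b.py": "import os\nimport json\n",
    "c.py": "import re, os, json\n",
}
index = build_index(sample)
print(files_using_all(index, ["os", "re"]))          # ['a.py', 'c.py']
print(files_using_all(index, ["os", "json", "re"]))  # ['c.py']
```

The slow part (reading every file) happens once in build_index; each
query afterwards is just a set intersection.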
If you want to find just definitions (like where is foo defined),
then use ctags or etags. There is exuberant ctags here:
http://ctags.sourceforge.net/
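
A tags file is plain text, so definitions can also be looked up from a
script. The sketch below assumes the usual ctags layout of
name<TAB>file<TAB>address, with optional extension fields after ;" and
metadata lines starting with !_TAG_. The helper name and the sample
tags text are hypothetical.

```python
def find_definitions(tags_text, name):
    """Return (file, address) pairs for every tag named `name`."""
    hits = []
    for line in tags_text.splitlines():
        if line.startswith("!_TAG_"):   # metadata lines, not real tags
            continue
        parts = line.split("\t", 2)
        if len(parts) < 3 or parts[0] != name:
            continue
        # Drop the optional ';"' extension fields after the address.
        address = parts[2].split(';"', 1)[0]
        hits.append((parts[1], address))
    return hits

# Hypothetical tags content, for illustration only.
sample_tags = (
    "!_TAG_FILE_SORTED\t1\t/0=unsorted, 1=sorted/\n"
    "bar\tsrc/bar.c\t/^int bar(void)$/;\"\tf\n"
    "foo\tsrc/foo.c\t/^int foo(int x)$/;\"\tf\n"
)
print(find_definitions(sample_tags, "foo"))
# [('src/foo.c', '/^int foo(int x)$/')]
```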
Perry Smith ( pedz@xxxxxxxxxxxxxxxx )
Ease Software, Inc. ( http://www.easesoftware.com )
Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems