Chris,
I just assumed that everyone on this list was using MySQL... That's
what we're using (v. 5.1.45 GA) with InnoDB as the engine.
(I just read your tutorial at http://www.designmagick.com/article/16/
It was very helpful - I look forward to checking out more of your articles.)
If it is just as efficient to use multiple separate indexes, that would
make index building less complicated on our large db.
Without a large dataset, it hard to truly test a system and if you have
a large dataset, like we do, it takes quite a while to build indexes.
Our project is a petition signature validation suite. Since many of the
handwritten names and addresses on petition sheets are difficult to
read, the user needs to be able to do some fuzzy searching. Sometimes
it's easier to read the address than it is the name. The zip code is
usually easy to read. We almost always need to use LIKE queries, since
some part of the name or address is typically hard to read. (We try to
use as many of the leading characters as we can and wildcard the remaining.)
For this part of the project joins are not needed. We could eliminate
most of the 55 columns, which would reduce the size of the db, but I
don't know what that would do to speed...
Does any of this affect the approach you would suggest?
Thanks!!
- Ron
On 3/17/2010 11:56 PM, Chris wrote:
listread wrote:
Greetings!
We are working on a 4 million record db with about 55 fields. We
need to build indexes to speed queries, but it seems to take way too
long.
I'm assuming your using mysql but that's just a guess.
What database? (mysql, oracle, postgres, firebird, sqlite, other)
If it's mysql, what db engine?
Here are some questions:
1) If a query uses two or three fields is it best to prepare an index
with those two or three fields, or will the query be just as
efficient if we build three separate index, one for each field?
Relates to above, but assuming mysql then it depends on what version.
multicolumn indexes work well in all mysql versions. separate indexes
can be used and combined in newer versions (v5 I think it came in).
2) Is the index building process any slower, per index, if you are
building more than one index at a time?
Most indexing I think takes an exclusive lock on the table, so index 1
gets created, index 2 has to wait it's turn until index 1 finishes.
I wrote an article on this a while ago, might give you some more
insights: http://www.designmagick.com/article/16/
--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php