mr. maas, psychic me ........ you are a man. i created an index, no change. but i already knew that because one of the cases where an index is never used is when > The key used to fetch the rows is not the same as the one used in the > > ORDER BY: > > which is the case here. (index creation stuff follows) i'm just going to bite the bullet for a few months and do a massive code rehaul. move all the matching code preferably to the sql-server. or pre-compiled C code for the php [ make an extension ]. i think the mb_ functions in php are very slow ( reportedly ) ... so move those somewhere else is a good idea. maybe the 'select ... like' clauses to the sql-server maybe the improvements you suggest will make a difference, maybe not, but when i think about how fast mysql runs its mysqlimport --local --fields-terminated-by='*X*X' dict_explicit french_english french_english compared with mysql < a.sql (bunches of insert statements) that's the kind of improvement i am hoping for. more than 500 queeries a second i am getting now, i need to really really improve that if i am going to parse html on the fly with the technology i have. mysql> describe korean_english; +-----------+---------------------+------+-----+---------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +-----------+---------------------+------+-----+---------------------+----------------+ | wordid | int(11) | | PRI | NULL | auto_increment | | word | varchar(130) | YES | MUL | NULL | | | syn | varchar(190) | YES | | NULL | | | def | blob | YES | | NULL | | | posn | int(2) | YES | | 1 | | | pos | varchar(13) | YES | | 1 | | | submitter | varchar(25) | YES | | NULL | | | doe | datetime | | | 0000-00-00 00:00:00 | | | wordsize | tinyint(3) unsigned | YES | | NULL | | +-----------+---------------------+------+-----+---------------------+----------------+ 9 rows in set (0.00 sec) mysql> explain select -> word,def,wordid,pos,posn,wordsize,syn from korean_english where word -> like '운전할 %' order by wordsize desc; +----+-------------+----------------+-------+---------------+----------+---------+------+------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------------+-------+---------------+----------+---------+------+------+-----------------------------+ | 1 | SIMPLE | korean_english | range | word_idx | word_idx | 391 | NULL | 1 | Using where; Using filesort | +----+-------------+----------------+-------+---------------+----------+---------+------+------+-----------------------------+ 1 row in set (0.01 sec) mysql> CREATE INDEX wordsize_index USING BTREE ON korean_english(wordsize); Query OK, 205265 rows affected (11.16 sec) Records: 205265 Duplicates: 0 Warnings: 0 mysql> explain select word,def,wordid,pos,posn,wordsize,syn from korean_english where word like '운전할 %' order by wordsize desc;e); +----+-------------+----------------+-------+---------------+----------+---------+------+------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------------+-------+---------------+----------+---------+------+------+-----------------------------+ | 1 | SIMPLE | korean_english | range | word_idx | word_idx | 391 | NULL | 1 | Using where; Using filesort | +----+-------------+----------------+-------+---------------+----------+---------+------+------+-----------------------------+ 1 row in set (0.00 sec) mysql> 2006-03-13 (월), 14:04 +0100, Jochem Maas 쓰시길: > ... > > >>>word,def,wordid,pos,posn,wordsize,syn from korean_english where word > >>>like '운전할 %' order by wordsize desc > >> > >>oh would you look at this.... > >>you're ordering by WORDSIZE. > >>stick an index on WORDSIZE!!! > > > > > > > > http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html > > > > > > In some cases, MySQL cannot use indexes to resolve the ORDER BY, > > note 'In Some Cases'. > > so stick an index on WORDSIZE and find out. > > > although it still uses indexes to find the rows that match the WHERE > > clause. These cases include the following: > > > > The key used to fetch the rows is not the same as the one used in the > > ORDER BY: > > > > SELECT * FROM t1 WHERE key2=constant ORDER BY key1; > > > > 2) > > > > > >>you have an iceballs' chance in hell that I'm going to even read > >>the 500+ lines of code that followed here ... let alone try to > >>optimize it. ;-) > >> > > > > > > it was just a 40 line summary of 550 lines of code ........ > > ah, talk about being caught out :-) > regardless a 40line summary won't cut it either - you have to take > the block as a whole. > > > > > > > > > questions i may deem myself to answer: > > > >>>) i get this: > >>>root@www mysql]# tail -f /var/lib/mysql/mysqld_query.log > out > >>> > >>>[root@www mysql]# cat out | wc -l > >>>15910 <<< ----------- that's line count > >> > >>how many queries? > > > > 15, 910 queries > > so roughly (15 * 60) seconds to run 15,000+ > queries and do the processing? that doesn't actually > sound so bad. > > > > > > > > > things already taken care of: > > 1) > > > >>> 9795 Query select > >>>word,def,wordid,pos,posn,wordsize,syn from korean_english where word > >>>like '운전할' order by wordsize desc > >> > >>in cases when you are not using the wildcard tokens (percentage signs) > >>try changing the query to use something like: > >> > >> ... word = '운전할' ... > > > > > > your suggestion and a line from the query_log match exactly. > > > > > > 2) > > > >><snip> > >> > >>>then, it sends each token to CallmatchThis (line 14) which calls > >>>matchThis (line 27 - 47 below) > >>>matchThis may be called twice (2 sql queeries) > >>>(line 51) select * where word = '$token' and another (take that, but if > >>>it's not there .... issue the next sql ) > >>>(line 55) select * where word like '$token%'; > >> > >>Dont do "SELECT *" - always explicitly specify the fields you want. > >> > > > > that was just a paraphrase. the previous email points to the line number > > of the code summary > > never paraphrase code - you only end up with smart ass comments like mine! > > have you reordered you fields in the db yet? adn made as many VARCHARs as > possible into CHARs? > > > > > > >> > >> > >>>------------------------------------------+ > >>>| korean_english | CREATE TABLE `korean_english` ( > >>> `wordid` int(11) NOT NULL auto_increment, > >>> `word` varchar(130) default NULL, > >>> `syn` varchar(190) default NULL, > >>> `def` blob, > >>> `posn` int(2) default '1', > >>> `pos` varchar(13) default '1', > >>> `submitter` varchar(25) default NULL, > >>> `doe` datetime NOT NULL default '0000-00-00 00:00:00', > >>> `wordsize` tinyint(3) unsigned default NULL, > >>> PRIMARY KEY (`wordid`), > >>> KEY `word_idx` (`word`), > >>> KEY `wordid_idx` (`wordid`) > >>>) ENGINE=MyISAM DEFAULT CHARSET=utf8 | > >> > >>reorder the fields so that the VARCHARS are at the end of the > >>table (and the BLOB field at the very, very end of the table). > >> > >>also change VARCHARs to CHARs where you can. > >> > >> > >>>+----------------+----------------------------------------------------------------------------------------------------------------------------------------- > >>>----------------------------------------------------------------------------------------------------------------------------------------------------------- > >>>--------------------------------------------------------------------- > >>> > >>> > >>> > >>> > >>> > >>> 1 function MainLoop() > > > > > -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php