Re: Alterations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Note! IMHO

Creating test table:
-------------------------------
CREATE TABLE `thread` (
  `threadid` int(10) unsigned NOT NULL auto_increment,
  `site_user_id` varchar(32) NOT NULL default '',
  `boardid` tinyint(3) unsigned NOT NULL default '0',
  `subject` varchar(200) NOT NULL default '',
  `modified` timestamp(14) NOT NULL,
  `created` timestamp(14) NOT NULL,
  `status` enum('L','P','H','D') NOT NULL default 'L',

  PRIMARY KEY  (`threadid`),
  FULLTEXT KEY `subject` (`subject`),
  KEY `boardid` (`boardid`),
  KEY `site_user_id` (`site_user_id`),
  KEY `created` (`created`),
  KEY `status` (`status`)
) TYPE=MyISAM;

Insert data:
---------------------------------
<?php

  set_time_limit( 6000 );
  error_reporting( E_ALL );

  define ( "NUM", '200000' );

  mysql_pconnect( 'localhost', '---', '------' );
  mysql_selectdb( 'richardboard' );
  
  $name = "Pavel Lobovich, Belarus";
  $text = "Smarty is a template engine for PHP. More specifically, it
facilitates a manageable way to separate application logic and content
from its presentation. This is best described in a situation where the
application programmer and the template designer play different roles,
or in most cases are not the same person. For example, let's say you are
creating a web page that is displaying a newspaper article. ";
  
  $count = NUM;
  $status = array( 'L', 'P', 'H', 'D' );
  while ( $count-- ) {
    
    $sql = 'INSERT INTO `thread` VALUES(NULL, "' . substr( $name, 0,
rand(6,24) ) . '", "' . rand( 1, 100) .'", "' . substr( $text, 0, rand(
1, 255 ) ) . '", NOW(), NOW(), "' . $status[ rand(0,3) ] . '" )';
    mysql_query( $sql );
    
  }
  
?>

Stats:
----------------------
Data: ~27MB
Index: ~34MB
Total: ~62MB

Post action:
----------------------
OPTIMIZE TABLE `thread`;

Query:
----------------------
SELECT * FROM `thread` WHERE `boardid` = "10" AND `status` = "L" ORDER
BY `created` DESC

Explain:
----------------------
thread  	ref  	boardid,status  	boardid  	1  	const  	2182(Hm...) 
	where used; Using filesort

2182(rows) * 650(average) = 1.4MB

Alteration (1):
----------------------
ALTER TABLE `thread` DROP INDEX `boardid`;
ALTER TABLE `thread` DROP INDEX `status`;
ALTER TABLE `thread` ADD INDEX `new_index` ( `boardid`, `status` );

Query:
----------------------
SELECT * FROM `thread` WHERE `boardid` = "10" AND `status` = "L" ORDER
BY `threadid` DESC

Explain:
----------------------
thread  	ref  	new_index  	new_index  	2  	const,const  	360(Good) 
	where used; Using filesort

360(rows) * 650(average) = 234KB


Benckmark script:
----------------------
...
  
  function get_timestamp() {
    list ($sec, $usec) = explode( ' ', microtime() );
    return (float)$sec + (float)$usec;
  }
  
  $start = get_timestamp();
  $sql = [TEST_QUERY];
  $res = mysql_query( $sql );
  
  $end = get_timestamp();
  
  echo $end - $start;
  
...

Benchmark (Duron 1400, 512MB DDR333 PC2700, HD 5400RPM):
----------------------
SELECT * FROM `thread` WHERE `boardid` = "10" AND `status` = "L" ORDER
BY `created` DESC
0.012056972503662 s

SELECT * FROM `thread` WHERE `boardid` = "10" AND `status` = "L" ORDER
BY `threadid` DESC
0.0099239349365234 s

Alteration (2):
----------------------
ALTER TABLE `thread` CHANGE `threadid` `threadid` INT( 4 ) UNSIGNED NOT
NULL AUTO_INCREMENT
Main goals: 
1. Less data size
2. `threadid` is now 32bit value, so comparation of two `threadid`
values is equal "CMP EAX, EBX" Assembler code ( 1 CPU tick ).


Alteration (3)
----------------------
CREATE TABLE thread_subjects (
  thread_ref int(4) unsigned NOT NULL default '0',
  subject varchar(200) NOT NULL default '',
  PRIMARY KEY  (thread_ref)
) TYPE=MyISAM;

INSERT INTO `thread_subjects` SELECT `threadid`, `subject` FROM
`thread`;

Query:
----------------------
SELECT * FROM `thread` LEFT JOIN `thread_subject` ON `thread_ref` =
`thread_id` WHERE `boardid` = "10" AND `status` = "L" ORDER BY
`threadid` DESC

Explain:
thread  	ref  	new_index  	new_index  	2  	const,const  	398  	where
used; Using filesort
thread_subjects 	eq_ref 	PRIMARY 	PRIMARY 	4 	thread.threadid 	1

Time
----------------------
0.015573978424072, but:

398(rows) *  63(average) =  25KB
I think the filesort will be executed more faster

Alteration (4)
----------------------
Change `site_user_id` FROM VARCHAR(32) TO CHAR(32)

Query:
----------------------
SELECT * FROM `thread` LEFT JOIN `thread_subject` ON `thread_ref` =
`thread_id` WHERE `boardid` = "10" AND `status` = "L" ORDER BY
`threadid` DESC

Time:
----------------------
0.013375043869019 =)

Alteration (5):
----------------------
Move `site_user_id` to `thread_subjects`;

398(rows) * 37(average) = 14KB, sorting is more faster
etc...

Best regards, Pavel

> Hello Pavel,
> 
> Tuesday, March 2, 2004, 12:33:17 PM, you wrote:
> 
> PL> http://www.mysql.com/documentation/mysql/bychapter/manual_SQL_Syntax.html#ALTER_TABLE
> PL> ORDER BY allows you to create the new table with the rows in a specific
> PL> order. Note that the table will not remain in this order after inserts
> PL> and deletes. In some cases, it might make sorting easier for MySQL if
> 
> Figured as much, thanks for the reference. I hope one day MySQL will
> produce a manual that isn't one horrendous long document per section.
> 
> PL> You need to split the table and move "post messages" to another table if
> PL> you want to get higher perfomance for (2).
> 
> With regard to disk seeking, here is my table structure:
> 
> CREATE TABLE `thread` (
>   `threadid` int(10) unsigned NOT NULL auto_increment,
>   `site_user_id` varchar(32) NOT NULL default '',
>   `boardid` tinyint(3) unsigned NOT NULL default '0',
>   `subject` varchar(200) NOT NULL default '',
>   `modified` timestamp(14) NOT NULL,
>   `created` timestamp(14) NOT NULL,
>   `status` enum('L','P','H','D') NOT NULL default 'L',
> 
>   PRIMARY KEY  (`threadid`),
>   FULLTEXT KEY `subject` (`subject`),
>   KEY `boardid` (`boardid`),
>   KEY `site_user_id` (`site_user_id`),
>   KEY `created` (`created`),
>   KEY `status` (`status`)
> ) TYPE=MyISAM;
> 
> As well as moving the subject field to another table (because it's the
> only non-defined length field), would another way of speeding up the
> disk seek be to turn it from a varchar(200) into a char(200)? I know
> it means a larger table size, but MySQL should then be able to
> calculate exactly where to jump to in the file?
> 
> PL> All MySQL functions should be very optimized, but there may be some
> PL> exceptions. BENCHMARK(loop_count,expression) is a great tool to find out
> PL> if this is a problem with your query.
> 
> I will look at this now.
> 
> I had been toying with the idea of creating a cache table that held
> the 200 most recent threads, pre-sequenced so I just bring back that
> instead of having to query the database unless they go beyond that 200
> limit.
> 
> Your comments (and MySQL manual posts) have been very useful, thank
> you.
> 
> -- 
> Best regards,
>  Richard Davey
>  http://www.phpcommunity.org/wiki/296.html

-- 
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


[Index of Archives]     [PHP Home]     [PHP Users]     [Postgresql Discussion]     [Kernel Newbies]     [Postgresql]     [Yosemite News]

  Powered by Linux