Re: Performance question

Chris <dmagick@xxxxxxxxx> · Thu, 02 Feb 2006 14:08:10 +1100

Hi,

Any time you fetch results from a database it take up memory, you can't 
do much about that (you can limit the effect by using 'limit' in 
conjunction with paging and only getting the columns you need etc but 
that's about it).

If you're using a standard id/parentid type approach you're stuck with 
recursive sql calls.

If you want to take a different approach you could use a tree/node system.

You end up with stuff like this:

 categoryid |   node   | categoryname |
------------+----------+--------------+
          1 | 0001     | xxx          |
          2 | 00010002 | 111          |

(categoryid '2' is a subcategory of categoryid '1').

Then you can quickly find the path of a category using just sql.

The downside is moving categories/parents around is a pain.

I haven't tried this myself but having multiple parents you could have a 
separate table for just the joins with the 'node' (read: path) in it and 
it basically includes the path to the end category (which you join with 
your 'category' table to get the name etc).

If you need more info let me know.

Mathieu Dumoulin wrote:
Miles Thompson wrote:

At 12:02 PM 2/1/2006, Mathieu Dumoulin wrote:

This is more a "How would you do it" than a "How can i do it" question.

Didn't have time to try it, but i want to know how mysql_seek_row 
acts with large result sets.

For example im thinking of building a node tree application that can 
have dual direction links to nodes attached to different places.

I was wondering if i could actually have two result sets that query 
everything sorted by ID (Links and Nodes) then just seek the rows i 
need instead of dumping everything in php memory. When i mean large i 
mean really large, more than the standard possible 2 mbs of data 
allowed by most php servers.

That's where the "how you'd do it" comes into play. I think i'd just 
query my tables, loop them but keep only the line (to do a data_seek 
later on) and ID in some kind of an hash table or simply an array. 
This would make it relatively fast without taking too much memory.

This is my solution, how do you people see it without dumping 
everything to memory or by making recursive SQL calls (which will 
obviously slow everything down i'm pretty sure)

Mathieu Dumoulin
Programmer analyst in web solutions
mdumoulin@xxxxxxxxxxxxxxx

I don't get it - what are you trying to do? Why? What is being 
visualized in the tree structure?

Building trees is expensive, you're essentially duplicating the index 
of the database, and if you fetch the data it's in memory regardless 
of how you present it.

The mysql_data_seek needs the row number - but by  then you're on the 
client side, if this is to respond to a link.

Maybe I just don't see it - or maybe you're confusing server side and 
client side.

Regards - Miles Thompson

I'll try to explain my post better since people dont seem to understand 
it. I'm trying to illustrate a node tree client side, everything is 
considered a node in a tree. Now these nodes can appear more than once 
since i allow a node to be linked to different elements (Not just one 
parent, i may need to refer to a node several times.

Considering that a node may be there twice or more, that the number of 
nodes to work with may be quite high, i dont want to start copying into 
a php memory structure everything from the database that i might need... 
Nor do i want to recursively call SQL queries getting childrens of every 
parent. (That would cost a lot to the database server)

So i was wondering what would be the best idea to actually implement 
this... "dilema" of keeping it fast and not using too much memory.

As  i understand your reply Miles, the query im doing resides in the php 
memory even if i don't fetch the data? If you do confirm that i WILL 
have to resort to recursive SQL calls. :(

To finish my post i'd re-explain rapidly my vision of doing this, but 
obviously i may be mistaken, maybe i can't do this. Taking two sql 
queries, one for the nodes, one for the links between nodes, i'd first 
read the results sorted by id, and do a mapping table for each result. 
When i need to use some info, i'd map the id of this info to the mapping 
table and retrieve the row from the result residing in mysql memory. (Up 
to now, this seemed logical until Miles told me that results are kept in 
PHP memory.) If this is still a valid way of doing it great. What would 
be the impact on the sQL server. Would this be a good way? I'd like to 
know from other pros how they'd implement it.

Mathieu Dumoulin

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php