Re: Working with large datasets

On Oct 10, 2011, at 4:27 PM, Thompson, Jimi wrote:

I really think that you should try running it from the command line and see what the issues are. Get both Apache and php out of the way. I've seen some PHP scripts use up all the file handles (OS limit) even on a 64 bit server when they start doing complex things with data sets.

If it works OK without PHP/Apache, then you can start looking at PHP and Apache.

ISOLATE the issue, don't complicate it....
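As a first step in that direction, running the query from a one-off PHP CLI script at least takes Apache out of the picture. A rough sketch (host, credentials, table and column names are all placeholders, not from Jason's setup):

<?php
// test_query.php -- run as "php test_query.php" from a shell so Apache is
// not involved at all. Connection details and the table/column names are
// placeholders; substitute your own.
$start = microtime(true);

$db = new mysqli('localhost', 'user', 'password', 'mydb');
if ($db->connect_error) {
    die('Connect failed: ' . $db->connect_error . "\n");
}

$result = $db->query("SELECT COUNT(*) AS cnt FROM records WHERE state = 'MI'");
$row = $result->fetch_assoc();

printf("%d rows matched in %.2f seconds\n", $row['cnt'], microtime(true) - $start);

If that finishes quickly, the problem is in the web stack; if it doesn't, it's the query or the server configuration.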

My 2 cents,

Jimi
________________________________________
From: Bastien [phpster@xxxxxxxxx]
Sent: Monday, October 10, 2011 4:19 PM
To: Jason Pruim
Cc: php-db@xxxxxxxxxxxxx
Subject: Re:  Working with large datasets

On 2011-10-10, at 11:30 AM, Jason Pruim <lists@xxxxxxxxxxxxxxxxxxxx> wrote:

Hey everyone,


I am working with a database that has close to 8 million records in it, and it will be growing. I have a state field in the data, and I am attempting to test some queries on it; all but 2 records right now have the same state.

My test info won't get pulled up... I believe it keeps timing out the connection.

Is there any advice for working with large datasets? I want this to load quickly.

Thanks in advance!


Jason Pruim
lists@xxxxxxxxxxxxxxxxxxxx






Assuming MySQL, what is my.cnf set to? Check that you are using the large-dataset one; by default it's usually the small one. That will give you more memory and sort space to work with the data.
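If you're not sure which sample config the server was started from, you can ask it directly. A quick sketch (connection details are placeholders) that prints a few of the buffer settings so you can compare them against the shipped my-large.cnf / my-huge.cnf samples:

<?php
// Dump the buffer/sort settings the server is actually running with.
// Connection details are placeholders.
$db = new mysqli('localhost', 'user', 'password', 'mydb');

$vars = $db->query("SHOW VARIABLES WHERE Variable_name IN
    ('key_buffer_size', 'sort_buffer_size', 'read_buffer_size', 'tmp_table_size')");

while ($row = $vars->fetch_assoc()) {
    echo $row['Variable_name'] . ' = ' . $row['Value'] . "\n";
}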

We routinely handle 8-10 million records and it's not tough. The tricks are:

1: ensure enough sort space
2: ensure enough memory for large sets
3: ensure enough PHP memory for the results
4: try to add additional filters to reduce the data sets. A cardinality of two on a status column will always return tons of records, so you want to narrow that down, maybe with a date range (see the sketch after this list)
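For point 4, a rough sketch of what that filter might look like; the table name, column names, and date window are made up for illustration:

<?php
// Combine the low-cardinality state filter with a date range so the server
// never has to hand back millions of rows at once. Names are placeholders.
$db = new mysqli('localhost', 'user', 'password', 'mydb');

$stmt = $db->prepare(
    "SELECT id, name, state, created_at
       FROM records
      WHERE state = ?
        AND created_at BETWEEN ? AND ?
      LIMIT 1000"
);
$state = 'MI';
$from  = '2011-09-01';
$to    = '2011-10-01';
$stmt->bind_param('sss', $state, $from, $to);
$stmt->execute();

// get_result() needs the mysqlnd driver; bind_result() works otherwise.
$result = $stmt->get_result();
while ($row = $result->fetch_assoc()) {
    // process one row at a time instead of buffering everything in PHP
}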




Bastien Koert
905-904-0334




Hi Jason,
I'd start with your max execution time and max memory first,
and go from there before you change much on your server.
Maybe try an .htaccess file in the directory of the PHP file
that makes the call, to allow only that file the extra memory and execution time. (I believe you can specify max memory and max execution time per directory, or even per file, with an .htaccess; please correct me if I am wrong.)
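A PHP-side version of the same idea is to raise the limits only inside the one script that runs the heavy query. A rough sketch; the numbers are illustrative, not recommendations:

<?php
// Raise the limits for this script only, instead of server-wide.
// (The .htaccess equivalent, when PHP runs as an Apache module, would be
//  php_value directives for the same two settings.)
ini_set('memory_limit', '512M');
ini_set('max_execution_time', '300');   // same effect as set_time_limit(300)

// ... run the big query here ...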
I think that, plus indexing your database, would help.
After that I would retrieve sets of info at a time instead of retrieving the whole database at once,
or, like Bastien stated, narrow it to certain date ranges.
Then you can create pagination if you're displaying results.
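If you do go the pagination route, something along these lines works; table and column names are placeholders, and it assumes an index covering the WHERE and ORDER BY columns:

<?php
// Fetch one page at a time with LIMIT/OFFSET.
// Names are placeholders; assumes an index on (state, created_at).
$page     = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$per_page = 100;
$offset   = ($page - 1) * $per_page;

$db = new mysqli('localhost', 'user', 'password', 'mydb');

$stmt = $db->prepare(
    "SELECT id, name, created_at
       FROM records
      WHERE state = ?
   ORDER BY created_at
      LIMIT ? OFFSET ?"
);
$state = 'MI';
$stmt->bind_param('sii', $state, $per_page, $offset);
$stmt->execute();
$stmt->bind_result($id, $name, $created_at);

while ($stmt->fetch()) {
    echo "$id  $name  $created_at\n";
}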

HTH,

Karl DeSaulniers
Design Drumm
http://designdrumm.com


--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


