דניאל דנון wrote:
And you got a point - I don't know all the queries I'll run yet, but I'll probably do them with Perl.
From what you described, it doesn't sound overly complicated to do in PHP either. If you are more familiar with PHP, it will probably take you less time to code it that way. Definitely process the file line by line, or a few lines at a time, rather than loading the whole thing into memory.
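A minimal sketch of the line-by-line approach, in case it helps. The function name processFileByLine and the callback signature are made up for illustration; the real work is done by fopen(), fgets() and fclose(), which keep only one line in memory at a time.

```php
<?php
/**
 * Read a file one line at a time and pass each line to a callback.
 * Returns the number of lines processed.
 * (Hypothetical helper, not something from this thread.)
 */
function processFileByLine(string $path, callable $handleLine): int
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $count = 0;
    try {
        // fgets() returns false at end of file, so the loop never
        // holds more than the current line.
        while (($line = fgets($handle)) !== false) {
            $count++;
            // Strip the trailing newline before handing the line over.
            $handleLine(rtrim($line, "\r\n"), $count);
        }
    } finally {
        fclose($handle);
    }

    return $count;
}
```

The callback is where the parsing and the database insert would go, so the reading loop itself never has to change.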
And although PHP can handle files that large, shouldn't I split them anyway? In case of some error or for debugging, it's better to do it before than after, no?
You can use chunks of the file to test the process (and speed up the development cycle not having to wait as long for the job to finish).
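One way to cut such a test chunk, sketched under the assumption that the first N lines are representative enough for development. The name writeSampleFile and its parameters are illustrative only.

```php
<?php
/**
 * Copy the first $maxLines lines of $source into $target.
 * Returns the number of lines actually copied.
 * (Hypothetical helper for building a small test file.)
 */
function writeSampleFile(string $source, string $target, int $maxLines): int
{
    $in = fopen($source, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open $source");
    }
    $out = fopen($target, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot write $target");
    }

    $copied = 0;
    // Stop at the line limit or at end of file, whichever comes first.
    while ($copied < $maxLines && ($line = fgets($in)) !== false) {
        fwrite($out, $line);
        $copied++;
    }

    fclose($in);
    fclose($out);

    return $copied;
}
```

Running the import against a sample of a few thousand lines first makes each test run short, and the full file only needs to be processed once the script behaves.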
BTW, do invoke the script via the command line, i.e. PHP-CLI. There is no execution time limit by default there, so a timeout won't bite you.
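A small guard for the top of such a script, as a sketch. The file name import.php in the message is made up; PHP_SAPI and set_time_limit() are standard PHP. The set_time_limit(0) call is redundant under the CLI defaults and is only there to make the intent explicit.

```php
<?php
// Refuse to run under a web server SAPI, where time limits apply.
if (PHP_SAPI !== 'cli') {
    exit("Run this from the command line: php import.php <file>\n");
}

// The CLI already defaults to no time limit; stated here for clarity.
set_time_limit(0);

// First command-line argument, or null if none was given.
$inputFile = $_SERVER['argv'][1] ?? null;
```

It would then be started with something like "php import.php data.txt" from a shell.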
I've read a bit about working with large databases, but since I haven't used REGEX much in MySQL queries, I would like to know how long you think a simple regex search (LIKEs?) on the database would take. The pattern will probably appear in most of the entries...
Regex searches on billions of records will be SLOOOOW. Ordinary indexes won't be able to help you much there, though you can experiment with full-text search. It might be the solution you are looking for.
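If you want to experiment with full-text search, a query could be built along these lines. This is only a sketch: the table and column names (entries, body), the index name and the connection details are invented, and the helper buildFullTextQuery is hypothetical. Keep in mind that full-text search matches whole words, not arbitrary patterns, and that MyISAM's natural language mode ignores words that occur in half or more of the rows, which may matter if your term shows up in most entries.

```php
<?php
/**
 * Build a MySQL full-text query with a bound :term placeholder.
 * Identifiers are checked because they cannot be bound as parameters.
 * (Hypothetical helper; table and column names are up to you.)
 */
function buildFullTextQuery(string $table, string $column): string
{
    foreach ([$table, $column] as $identifier) {
        if (preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $identifier) !== 1) {
            throw new InvalidArgumentException("Unsafe identifier: $identifier");
        }
    }

    return "SELECT * FROM `$table` "
         . "WHERE MATCH(`$column`) AGAINST (:term IN NATURAL LANGUAGE MODE)";
}

// Usage sketch. It needs a live MySQL connection, so it is commented out,
// and every name below is an assumption:
// $pdo = new PDO('mysql:host=localhost;dbname=example', 'user', 'secret');
// $pdo->exec('ALTER TABLE entries ADD FULLTEXT INDEX ft_body (body)');
// $stmt = $pdo->prepare(buildFullTextQuery('entries', 'body'));
// $stmt->execute([':term' => 'timeout']);
// $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
```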
Otherwise, do more of the processing as you import/prepare the data: split it into more specific fields and put proper indexes on those fields, so that queries can use the indexes instead of scanning text. How far you can take it will depend a bit on the quality and consistency of your data.
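To illustrate what I mean by more specific fields, here is a sketch that assumes, purely for the example, that each line looks like "2010-03-04 12:00:01 ERROR disk full". Your real format will differ, and the function name parseLine and the field names are invented. The idea is to run the regex once per line at import time and store the pieces in their own indexed columns.

```php
<?php
/**
 * Split one raw line into specific fields at import time.
 * Returns null for lines that do not match the assumed format,
 * so inconsistent data can be logged or skipped.
 * (Hypothetical format and field names.)
 */
function parseLine(string $line): ?array
{
    $pattern = '/^(?<date>\d{4}-\d{2}-\d{2}) '
             . '(?<time>\d{2}:\d{2}:\d{2}) '
             . '(?<level>[A-Z]+) '
             . '(?<message>.*)$/';

    if (preg_match($pattern, $line, $m) !== 1) {
        return null;
    }

    return [
        'logged_at' => $m['date'] . ' ' . $m['time'],
        'level'     => $m['level'],
        'message'   => $m['message'],
    ];
}
```

With an ordinary index on a column such as level, a lookup like WHERE level = 'ERROR' no longer needs a regex at all.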
Cheers,
Mattias