On Thu, Dec 13, 2012 at 12:41 PM, Matijn Woudt <tijnema@xxxxxxxxx> wrote:
> On Thu, Dec 13, 2012 at 5:13 PM, Jim Giner <jim.giner@xxxxxxxxxxxxxxxxxx> wrote:
>
>> On 12/13/2012 10:56 AM, Bastien wrote:
>>
>>> Bastien Koert
>>>
>>> On 2012-12-13, at 9:10 AM, Jim Giner <jim.giner@xxxxxxxxxxxxxxxxxx>
>>> wrote:
>>>
>>>> Thanks for the input, gentlemen. Two opposing viewpoints!
>>>>
>>>> I understand the concept of using files for the docs and a table to
>>>> locate them and id them. But I am of the opinion that modern dbs are
>>>> capable of handling very large objects (which these docs are NOT!) much
>>>> more easily than years ago, so I am still leaning that way. It will
>>>> certainly make my search process easier!
>>>>
>>>> More comments, anyone?
>>>>
>>>> --
>>>> PHP General Mailing List (http://www.php.net/)
>>>> To unsubscribe, visit: http://www.php.net/unsub.php
>>>>
>>> I got away from storing blobs in the db. I noticed significant slowness
>>> after the db grew to about 12 GB in MySQL. Backups are also affected, as
>>> they take longer. This was an older MySQL, but it affected my MSSQL
>>> server the same way.
>>>
>>> Nowadays it's files into the file system and data into the db. One thing
>>> you could consider is reading the contents of the file into a db field and
>>> storing just the text to allow full-text search.
>>>
>>> Bastien
>>>
>> A very clever idea! I like it - the best of both worlds. Can you sum
>> up a method for getting the text out of the .doc (or .rtf) files so that I
>> can automate the process for my past and future documents?
>> Is there a single PHP function that would accomplish this?
>
> There's no built-in function for such stuff. .doc files are quite tricky to
> parse, but .rtf files can be parsed pretty easily. One project is PHPRtfLite
> [1], which provides you an API for doing this.
>
> - Matijn
>
> [1] http://sourceforge.net/projects/phprtf/

There is
http://stackoverflow.com/questions/188452/reading-writing-a-ms-word-file-in-php
which has some discussion on reading those files with Antiword
(http://www.winfield.demon.nl/)

--

Bastien

Cat, the other other white meat
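
For what it's worth, the approach discussed above (files stay on disk, only
the extracted text goes into a db field for searching) could be sketched
roughly like this. It is a minimal illustration, not a real RTF parser, and
it is written in Python only to keep it short and self-contained. Everything
named here is an assumption for the example: the `documents` table, the
`docs/minutes.rtf` path, the partial list of skipped RTF destinations, the
cp1252 decoding of \'hh escapes, and SQLite with LIKE standing in for MySQL
and its full-text search.

```python
import re
import sqlite3

# Destinations whose contents are metadata rather than body text.
# Assumption: this is a partial list, enough for simple documents only.
SKIP_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict"}

TOKEN = re.compile(
    r"\\([a-z]+)(-?\d+)? ?"      # control word, optional number, optional space
    r"|\\'([0-9a-fA-F]{2})"      # hex-escaped byte, e.g. \'e9
    r"|\\([^a-z])"               # control symbol, e.g. \{ \} \\ \*
    r"|([{}])"                   # group open / close
    r"|([^\\{}]+)"               # run of plain text
)

def rtf_to_text(rtf):
    """Naively strip RTF markup and return the visible text."""
    out = []
    stack = []        # saved 'skipping' flag for every open group
    skipping = False
    for m in TOKEN.finditer(rtf):
        word, _param, hexbyte, symbol, brace, text = m.groups()
        if brace == "{":
            stack.append(skipping)
        elif brace == "}":
            skipping = stack.pop() if stack else False
        elif symbol is not None:
            if symbol == "*":
                skipping = True            # optional destination: ignore it
            elif symbol in "{}\\" and not skipping:
                out.append(symbol)         # escaped literal character
        elif skipping:
            continue
        elif word is not None:
            if word in SKIP_DESTINATIONS:
                skipping = True
            elif word in ("par", "line"):
                out.append("\n")
            elif word == "tab":
                out.append("\t")
        elif hexbyte is not None:
            # Assumption: the document uses the Windows-1252 code page.
            out.append(bytes.fromhex(hexbyte).decode("cp1252", errors="replace"))
        elif text is not None:
            out.append(text.replace("\r", "").replace("\n", ""))
    return "".join(out).strip()

def index_document(conn, path, rtf):
    """Store the file's path plus its extracted text; the file stays on disk."""
    conn.execute("INSERT INTO documents (path, body) VALUES (?, ?)",
                 (path, rtf_to_text(rtf)))

def search(conn, term):
    """Return the paths of documents whose text contains the term."""
    rows = conn.execute(
        "SELECT path FROM documents WHERE body LIKE ? ORDER BY path",
        ("%" + term + "%",))
    return [row[0] for row in rows]

# Hypothetical sample document and table, for illustration only.
sample = r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\f0\fs24 Meeting notes\par Budget approved for caf\'e9.}"
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (path TEXT PRIMARY KEY, body TEXT)")
index_document(conn, "docs/minutes.rtf", sample)
```

With the sample indexed, `search(conn, "budget")` returns the stored path, so
the application can open the original file from disk. On MySQL the LIKE
query would more likely be replaced by a FULLTEXT index and MATCH ... AGAINST,
and real-world .rtf files (or .doc files run through Antiword) will need
more careful handling than this.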