Re: storing & searching docs

On Thu, Dec 13, 2012 at 12:41 PM, Matijn Woudt <tijnema@xxxxxxxxx> wrote:
> On Thu, Dec 13, 2012 at 5:13 PM, Jim Giner <jim.giner@xxxxxxxxxxxxxxxxxx> wrote:
>
>> On 12/13/2012 10:56 AM, Bastien wrote:
>>
>>>
>>>
>>> Bastien Koert
>>>
>>> On 2012-12-13, at 9:10 AM, Jim Giner <jim.giner@xxxxxxxxxxxxxxxxxx>
>>> wrote:
>>>
>>>> Thanks for the input, gentlemen.  Two opposing viewpoints!
>>>>
>>>> I understand the concept of using files for the docs and a table to
>>>> locate and identify them.  But I am of the opinion that modern dbs
>>>> handle very large objects (which these docs are NOT!) much more easily
>>>> than they did years ago, so I am still leaning that way.  It will
>>>> certainly make my search process easier!
>>>>
>>>> More comments anyone?
>>>>
>>>> --
>>>> PHP General Mailing List (http://www.php.net/)
>>>> To unsubscribe, visit: http://www.php.net/unsub.php
>>>>
>>>>
>>> I got away from storing blobs in the db. I noticed significant slowness
>>> after the db grew to about 12 GB in MySQL. Backups are also affected, as
>>> they take longer. This was an older MySQL, but it affected my MS SQL
>>> server the same way.
>>>
>>> Nowadays it's files into the file system and data into the db. One thing
>>> you could consider is reading the contents of the file into a db field and
>>> just storing the text to allow full-text search.
>>>
>>> Bastien
>>>
>> A very clever idea!  I like it - the best of both worlds.  Can you sum
>> up a method for getting the text out of the .doc (or .rtf) files so that I
>> can automate the process for my past and future documents?
>> Is there a single php function that would accomplish this?
>
>
> There's no built-in function for this. .doc files are quite tricky to
> parse, but rtf files can be parsed pretty easily. One project is PHPRtfLite
> [1], which provides you an API for doing this.
>
> - Matijn
>
> [1] http://sourceforge.net/projects/phprtf/


There is http://stackoverflow.com/questions/188452/reading-writing-a-ms-word-file-in-php
which has some discussion on reading those files with Antiword
(http://www.winfield.demon.nl/)
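
A minimal sketch of calling Antiword from PHP could look like the code below.
It assumes the antiword binary is installed and on the PATH of a Unix-like
server; the function name extractDocText and its second parameter are made up
for illustration, so treat it as a starting point rather than a tested
solution.

```php
<?php
// Hypothetical helper: run the antiword CLI on a .doc file and return the
// plain text it prints. Returns null if the file is unreadable or the
// command exits with a non-zero status.
// Assumes a Unix-like shell (the 2>/dev/null redirect discards error output).
function extractDocText(string $path, string $antiwordBin = 'antiword'): ?string
{
    if (!is_readable($path)) {
        return null;
    }

    // Escape both parts so an odd file name cannot break out of the command.
    $cmd = escapeshellcmd($antiwordBin) . ' ' . escapeshellarg($path) . ' 2>/dev/null';

    $lines = [];
    $exitCode = 0;
    exec($cmd, $lines, $exitCode);

    if ($exitCode !== 0) {
        return null;
    }

    // exec() returns the output one line per array element.
    return implode("\n", $lines);
}
```

Something like extractDocText('/path/to/letter.doc') would then give you the
text to put in the db field, or null when the conversion fails.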

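For the approach discussed earlier in the thread (files on disk, extracted
text in a db field for full-text search), a rough sketch with PDO and MySQL
might look like this. The table and column names (docs, path, body) and the
three function names are invented for illustration, and the PDO connection is
assumed to exist already.

```php
<?php
// Sketch only. Assumes a MySQL table along these lines:
//   CREATE TABLE docs (
//     id   INT AUTO_INCREMENT PRIMARY KEY,
//     path VARCHAR(255) NOT NULL,
//     body MEDIUMTEXT NOT NULL,
//     FULLTEXT KEY ft_body (body)
//   );
// Older MySQL versions need ENGINE=MyISAM for FULLTEXT; InnoDB supports it
// from MySQL 5.6 on.

// Collapse runs of spaces, tabs and newlines into single spaces.
function normalizeText(string $text): string
{
    return trim(preg_replace('/[ \t\r\n]+/', ' ', $text));
}

// Store the file's path plus its extracted text; the file itself stays on disk.
function indexDocument(PDO $pdo, string $path, string $text): void
{
    $stmt = $pdo->prepare('INSERT INTO docs (path, body) VALUES (:path, :body)');
    $stmt->execute([':path' => $path, ':body' => normalizeText($text)]);
}

// Return the paths of documents whose stored text matches the search terms.
function searchDocuments(PDO $pdo, string $terms): array
{
    $stmt = $pdo->prepare('SELECT path FROM docs WHERE MATCH(body) AGAINST (:terms)');
    $stmt->execute([':terms' => $terms]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

The db then only holds the searchable text and a pointer to the file, so it
stays small and backups of the documents remain ordinary file backups.
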
-- 

Bastien

Cat, the other other white meat
