Re: storing & searching docs

On Wed, Dec 12, 2012 at 01:00:41PM -0500, Jim Giner wrote:

> Slightly off-topic perhaps but I'm looking for general input here.
> 
> New idea for a project - save the minutes of my firehouse meetings
> into a mysql table and build a ui to search them for words and such.
> The docs are written in Word currently.  My simplistic idea is to
> perhaps convert them to something other than Word format and then to
> store them into a field of a mysql record with the meeting date as a
> key field.
> Of course having them online I should also allow for viewing as a
> document in something close to their original (?) format.
> 
> Any ideas - pro or con - on this idea?

First off, I'd convert them to RTF (Rich Text Format). Word format is
too ephemeral (i.e., self-incompatible from one version to the next).
RTF is a lowest common denominator that can be converted to a variety
of other formats, and it is a published format that both Word and
things like OpenOffice understand. The formatting of meeting minutes
doesn't call for a very complicated layout (something RTF isn't that
good at). I would suggest HTML, but Word is notoriously atrocious at
faithfully converting its own formats into HTML. The result is horrid.

Second, you've hit on one of my pet peeves. Never, never store huge
blocks of text in SQL tables. It slows the database down and there's no
real reason for it. There's no reason to force a DBMS to schlep around
massive clumps of text or binary data. That's what file systems are
for. Store the document itself in a file and store a reference to its
location in the SQL database. Or perhaps use a NoSQL solution. I don't
know much about the internals of NoSQL systems, but I would hope that
the metadata about the text objects is stored separately from the
"payload" (the text object itself).
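
A minimal sketch of the file-plus-reference idea, written in Python with
the standard-library sqlite3 module standing in for MySQL so that it runs
on its own. The table name, column names, and function names are all
made up for illustration, and the search is a naive substring scan over
the raw RTF (control words included), not a real full-text index.

```python
import sqlite3
from pathlib import Path


def open_db(db_path):
    """Open the database and create the (hypothetical) minutes table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS minutes ("
        " meeting_date TEXT PRIMARY KEY,"  # key field, e.g. '2012-12-03'
        " file_path TEXT NOT NULL)"        # where the document lives on disk
    )
    return conn


def store_minutes(conn, store_dir, meeting_date, rtf_text):
    """Write the document to a file; keep only its location in the table."""
    store_dir = Path(store_dir)
    store_dir.mkdir(parents=True, exist_ok=True)
    path = store_dir / f"{meeting_date}.rtf"
    path.write_text(rtf_text, encoding="utf-8")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO minutes (meeting_date, file_path)"
            " VALUES (?, ?)",
            (meeting_date, str(path)),
        )
    return path


def search_minutes(conn, word):
    """Return the meeting dates whose files contain `word` (case-insensitive)."""
    needle = word.lower()
    rows = conn.execute(
        "SELECT meeting_date, file_path FROM minutes ORDER BY meeting_date"
    ).fetchall()
    hits = []
    for meeting_date, file_path in rows:
        text = Path(file_path).read_text(encoding="utf-8")
        if needle in text.lower():
            hits.append(meeting_date)
    return hits
```

With something like this, the table stays small (one short row per
meeting) and the original file can be served as-is for viewing. If the
collection grows, the scan in search_minutes is the part to replace with
a proper index.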

Paul

-- 
Paul M. Foster
http://noferblatz.com
http://quillandmouse.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


