On Wed, Dec 12, 2012 at 01:00:41PM -0500, Jim Giner wrote:

> Slightly off-topic perhaps but I'm looking for general input here.
>
> New idea for a project - save the minutes of my firehouse meetings
> into a mysql table and build a ui to search them for words and such.
> The docs are written in Word currently. My simplistic idea is to
> perhaps convert them to something other than Word format and then to
> store them into a field of a mysql record with the meeting date as a
> key field.
> Of course having them online I should also allow for viewing as a
> document in something close to their original (?) format.
>
> Any ideas - pro or con - on this idea?

First off, I'd convert them to RTF (Rich Text Format). Word's native
format is too ephemeral (that is, incompatible from one version of Word
to the next). RTF is a lowest common denominator that can be converted
to a variety of other formats, and it is a publicly documented format
that both Word and things like OpenOffice understand. Meeting minutes
don't call for a very complicated layout (something RTF isn't that good
at). I would suggest HTML, but Word is notoriously atrocious at
faithfully converting its own formats into HTML. The result is horrid.

Second, you've hit on one of my pet peeves. Never, never store huge
blocks of text in an SQL database. It slows the database down, and
there's no real reason to force a DBMS to schlep around massive clumps
of text or binary data. That's what disk file systems are for. Store
the document in a file, and store a reference to the file's location in
the SQL database.

Or perhaps use a NoSQL solution. I don't know much about the internals
of NoSQL systems, but I would hope that the metadata about the text
objects is stored separately from the "payload" (the text itself).

Paul

--
Paul M. Foster
http://noferblatz.com
http://quillandmouse.com

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
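
For illustration, here is a minimal sketch of the "file on disk, reference
in the database" approach suggested in the reply above. It is not from the
original thread. It assumes Python with the standard-library sqlite3 module
standing in for MySQL, and the table name (minutes), column names, file
naming scheme, and function names are all hypothetical.

```python
import sqlite3
from pathlib import Path


def open_index(db_path):
    """Open (or create) the index database; it holds metadata only."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS minutes ("
        " meeting_date TEXT PRIMARY KEY,"   # ISO date string, e.g. 2012-12-12
        " file_path    TEXT NOT NULL)"      # location of the document on disk
    )
    return conn


def store_minutes(conn, storage_dir, meeting_date, rtf_bytes):
    """Write the document to disk and record only its location in SQL."""
    storage_dir = Path(storage_dir)
    storage_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: one RTF file per meeting date.
    target = storage_dir / f"minutes-{meeting_date}.rtf"
    target.write_bytes(rtf_bytes)
    conn.execute(
        "INSERT OR REPLACE INTO minutes (meeting_date, file_path) VALUES (?, ?)",
        (meeting_date, str(target)),
    )
    conn.commit()
    return target


def load_minutes(conn, meeting_date):
    """Look up the path by meeting date, then read the payload from disk."""
    row = conn.execute(
        "SELECT file_path FROM minutes WHERE meeting_date = ?", (meeting_date,)
    ).fetchone()
    if row is None:
        return None  # no minutes recorded for that date
    return Path(row[0]).read_bytes()
```

With this layout the database row stays small (a date and a path), and the
document itself is read from the file system only when someone asks to view
it. The same idea should carry over to MySQL with an equivalent table.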