I think I'm good with text for the db and search capability, and the pdf for pure display.

jg

On Dec 15, 2012, at 5:31 PM, tamouse mailing lists <tamouse.lists@xxxxxxxxx> wrote:

> On Sat, Dec 15, 2012 at 11:22 AM, Jim Giner
> <jim.giner@xxxxxxxxxxxxxxxxxx> wrote:
>> On 12/15/2012 8:29 AM, tamouse mailing lists wrote:
>>>
>>> On Dec 13, 2012 4:50 PM, "Jim Giner" <jim.giner@xxxxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> Thanks for all the posts. After reading and googling all afternoon, I
>>>> think the best approach for me is:
>>>>
>>>> Create two macros in Word (done!) to export each of my .doc files to .txt
>>>> and .pdf formats.
>>>>
>>>> Create a SQL table to hold the .txt contents of my .doc files, along with
>>>> a reference to the meeting date and the name of the corresponding .pdf
>>>> file.
>>>>
>>>> Upload my two sets of files with an FTP client and then use a script to
>>>> load the table with my .txt file data.
>>>>
>>>> Now I just need a couple of scripts. One to allow a user to locate a file
>>>> and bring up the pdf when he wants to read about a meeting. And a second
>>>> script to accept user input (search words), perform a query against the
>>>> textual data, and present some kind of results - probably a listing
>>>> containing a reference to the meeting date and a tbd-length string showing
>>>> the matching result for each occurrence, i.e., something like n chars in
>>>> front of and after the match so the user can see the context of the match.
>>>>
>>>> Sizes - a 28kb .doc file grows to 142kb in .pdf format and is only 5kb in
>>>> .txt format. (Actually, if I 'print' the .doc as a pdf instead of using
>>>> Word's "File, Save as", the resulting pdf is only 70kb. Might need a
>>>> new macro!)
>>>>
>>>
>>> PDF might be better looking than this, but how big is an HTML doc exported
>>> from Word?
>>>
>>>> Thanks again!
>>>>
>>>> --
>>>> PHP General Mailing List (http://www.php.net/)
>>>> To unsubscribe, visit: http://www.php.net/unsub.php
>>>>
>>>
>> Word generates very many many words (!) when creating an html doc. Not a
>> good html generator at all.
>>
>
> I think my next email talked about sending the HTML through pandoc to
> make a plain text file, perhaps in markdown, which could be the thing
> you save, and then run it through a markdown filter to produce (a
> much, much leaner) HTML.
>
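
The search script described in the quoted plan (query the stored text, then list the meeting date with n characters either side of each match) could be sketched roughly as below. This is only an illustration in Python using SQLite, not the script from the thread; the same logic should port to PHP. The table name `minutes`, the columns `meeting_date`, `pdf_name` and `body`, and both function names are assumptions made up for the example.

```python
import re
import sqlite3


def context_snippets(text, term, n=30):
    """Return one snippet per case-insensitive match of `term` in `text`,
    with up to `n` characters of context before and after the match."""
    snippets = []
    for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        start = max(0, m.start() - n)
        end = min(len(text), m.end() + n)
        snippets.append(text[start:end])
    return snippets


def search_minutes(conn, term, n=30):
    """Return (meeting_date, pdf_name, snippet) tuples for every match.

    Assumes a hypothetical table `minutes(meeting_date, pdf_name, body)`.
    """
    # Escape LIKE wildcards so the user's search words are matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT meeting_date, pdf_name, body FROM minutes "
        "WHERE body LIKE ? ESCAPE '\\' ORDER BY meeting_date",
        ("%" + escaped + "%",),
    )
    results = []
    for meeting_date, pdf_name, body in rows:
        for snippet in context_snippets(body, term, n):
            results.append((meeting_date, pdf_name, snippet))
    return results


# Small demonstration with an in-memory database and invented sample text.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE minutes (meeting_date TEXT, pdf_name TEXT, body TEXT)")
demo.execute(
    "INSERT INTO minutes VALUES (?, ?, ?)",
    ("2012-11-05", "2012-11-05.pdf", "The board approved the budget for next year."),
)
for meeting_date, pdf_name, snippet in search_minutes(demo, "budget", n=10):
    print(meeting_date, pdf_name, "..." + snippet + "...")
```

The `pdf_name` in each result is what a listing page would turn into a link to the display copy, so the text stays in the db for searching and the pdf is only used for reading.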