Re: searching non plain text files

On Sat, Dec 15, 2018 at 12:20 PM Jeffry Killen <jekillen@xxxxxxxxxxx> wrote:
Hello;

Can anyone point me to instruction/advice about
opening and reading files that are not plain text:

word processing docs, pdf, ps, image files,
even compiled code.

I am writing a search function to search file systems
and don't know a lot about the formatting of non plain
text files.

The immediate concern is line breaks in word
processing docs, pdf and ps files.

Then detecting compiled code files so I can
leave them alone. This type of file might not
have a suffix to consider.
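
One possible way to spot compiled files without relying on a suffix is to look at the first few bytes. This is only a rough sketch in Python, not a complete detector: the function name looks_binary and the 4096-byte sample size are made up for illustration, and the NUL-byte check will also flag UTF-16 text, so treat a hit as "probably not plain text".

```python
# Leading byte signatures ("magic numbers") of common executable formats.
EXECUTABLE_MAGIC = (
    b"\x7fELF",            # ELF (Linux, BSD)
    b"MZ",                 # PE/COFF (Windows .exe, .dll)
    b"\xfe\xed\xfa\xce",   # Mach-O 32-bit
    b"\xfe\xed\xfa\xcf",   # Mach-O 64-bit
    b"\xce\xfa\xed\xfe",   # Mach-O 32-bit, byte-swapped
    b"\xcf\xfa\xed\xfe",   # Mach-O 64-bit, byte-swapped
    b"\xca\xfe\xba\xbe",   # Java class file or Mach-O universal binary
)


def looks_binary(path, sample_size=4096):
    """Heuristic (hypothetical helper): True if the file is probably not plain text."""
    with open(path, "rb") as f:
        head = f.read(sample_size)
    # A known executable signature at offset 0 is a strong hint.
    if head.startswith(EXECUTABLE_MAGIC):
        return True
    # Plain text in ASCII or UTF-8 should not contain NUL bytes.
    return b"\x00" in head
```

A search function could call looks_binary(path) first and skip the file when it returns True.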

Then the various image files that might be
encountered.

Even suffixes aren't a guarantee of the content.
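
Since suffixes can't be trusted, the same idea works for document and image files: most formats start with a fixed signature. Below is a small Python sketch with a few well-known signatures. The function name sniff_type and the labels it returns are just for illustration, the table is far from complete, and a ZIP signature only says "maybe .docx or .odt", since any ZIP archive starts the same way.

```python
# (signature, label) pairs checked against the start of the file.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"%PDF-", "pdf"),
    (b"%!PS", "postscript"),
    (b"{\\rtf", "rtf"),
    (b"PK\x03\x04", "zip"),  # also .docx, .odt and other ZIP containers
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),  # legacy .doc, .xls
)


def sniff_type(head):
    """Return a label for the leading bytes of a file, or None if unknown."""
    for signature, label in SIGNATURES:
        if head.startswith(signature):
            return label
    return None
```

Reading the first 16 bytes or so of each file and passing them to sniff_type is enough to decide which reader to use, or whether to skip the file.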

Hi, what about reading each format with its own API, then storing the extracted text in a database for easier searching?
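
To make that suggestion a bit more concrete, here is a minimal sketch in Python using the sqlite3 module from the standard library. The table name documents and the helpers open_index, index_text and search are invented for this example. It assumes the text has already been extracted by whatever format-specific reader you end up using, which is not shown here. Whitespace is collapsed before storing, so line breaks inside the original document don't split a phrase.

```python
import sqlite3


def open_index(db_path=":memory:"):
    """Open (or create) the hypothetical search index."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "path TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def index_text(conn, path, text):
    """Store already-extracted text for one file, replacing any old entry."""
    # Collapse runs of whitespace (including line breaks) to single spaces.
    body = " ".join(text.split())
    conn.execute(
        "INSERT OR REPLACE INTO documents (path, body) VALUES (?, ?)",
        (path, body),
    )
    conn.commit()


def search(conn, term):
    """Return the paths whose stored text contains the term, sorted by path."""
    # Escape LIKE wildcards so the term is matched literally.
    escaped = (
        term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT path FROM documents WHERE body LIKE ? ESCAPE '\\' "
        "ORDER BY path",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in rows]
```

A plain LIKE scan is fine for a small number of files. For a larger collection a full-text index would probably be a better fit, but that depends on what your database provides.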
 

Thanks

Jeff K.

