I have been thinking about how to apply NLP work to logs. The problem is that logs are structured data, not free text. It would depend on what you are looking for, what outcomes you want, and how you want to generate them. The other problem is that the problems you care about are going to be very rare events, and rare event detection is its own special mess. For example, if you have 1000 users generating 1-million-line log files, how do you pick out the 10 that are bad? You also need enough examples for those rare events to be representative, so you might need, say, 100-million-line log files with a known 100 or 1000 issues before models start to work.
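
To put rough numbers on the rare-event point, here is a minimal Python sketch. It assumes the "10 that are bad" means 10 bad lines in a 1-million-line log, and the function names are made up for illustration. It shows the base rate of bad lines and why plain accuracy is a misleading score: a detector that labels every line as normal looks nearly perfect while finding nothing.

```python
def base_rate(bad_lines, total_lines):
    """Fraction of log lines that are bad."""
    return bad_lines / total_lines


def always_normal_baseline(bad_lines, total_lines):
    """Accuracy and recall of a detector that labels every line as normal."""
    correct = total_lines - bad_lines      # every normal line is "right"
    accuracy = correct / total_lines
    recall = 0.0                           # it never flags a bad line
    return accuracy, recall


# Scenarios taken from the numbers above (assumed to count log lines).
scenarios = [
    ("1M lines, 10 bad", 10, 1_000_000),
    ("100M lines, 100 bad", 100, 100_000_000),
    ("100M lines, 1000 bad", 1000, 100_000_000),
]

for name, bad, total in scenarios:
    rate = base_rate(bad, total)
    accuracy, recall = always_normal_baseline(bad, total)
    print(f"{name}: base rate={rate:.7%}, "
          f"do-nothing accuracy={accuracy:.5%}, recall={recall:.0%}")
```

With base rates this low, precision and recall on the bad class are probably more useful to track than accuracy.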
Thanks,
~Ben
On Fri, Jul 12, 2019 at 2:44 AM Toomas Kristin <toomas.kristin@xxxxxxxxx> wrote:
Hi,
Basically, it seems that data science and machine learning are going to become more and more popular in every field of life. I have considered using machine learning on top of the logs generated by PostgreSQL servers. However, before I start, maybe someone has already done that and can share some experience?
Toomas