Search Postgresql Archives

Re: Log Data Analytics : Confused about the choice of Database

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Why not store session as integer? 

And timestamp as timesamp(z?) ? 

If you know the types of events, also store them as integer , and save a map of them in the app or on another table ?

And save the parameters as a json column, so you have more data-types? Hstore only has strings.

Be carefull with the mongodb hipster on the stackoverflow post. Elasticsearch is often used for log collection.

So, what really is the problem ?



On Tue, May 13, 2014 at 4:11 AM, Peeyush Agarwal <peeyushagarwal1994@xxxxxxxxx> wrote:
Hi,

I have log data of the following format:



Session    Timestamp    Event                Parameters
1          1            Started Session      
1          2            Logged In            Username:"user1"
2          3            Started Session
1          3            Started Challenge    title:"Challenge 1", level:"2"
2          4            Logged In            Username:"user2"
Now, a person wants to carry out analytics on this log data (And would like to receive it as a JSON blob after appropriate transformations). For example, he may want to receive a JSON blob where the Log Data is grouped by Session and TimeFromSessionStart and CountOfEvents are added before the data is sent so that he can carry out meaningful analysis. Here I should return:



[
  {
    "session":1,"CountOfEvents":3,"Actions":[{"TimeFromSessionStart":0,"Event":"Session Started"}, {"TimeFromSessionStart":1, "Event":"Logged In", "Username":"user1"}, {"TimeFromSessionStart":2, "Event":"Startd Challenge", "title":"Challenge 1", "level":"2" }]
  },
  { 
    "session":2, "CountOfEvents":2,"Actions":[{"TimeFromSessionStart":0,"Event":"Session     Started"}, {"TimeFromSessionStart":2, "Event":"Logged In", "Username":"user2"}] 
  }
]

Here, TimeFromSessionStartCountOfEvents etc. [Let's call it synthetic additional data] will not be hard coded and I will make a web interface to allow the person to decide what kind of synthetic data he requires in the JSON blob. I would like to provide a good amount of flexibility to the person to decide what kind of synthetic data he wants in the JSON blob.

If I use PostgreSQL, I can store the data in the following manner: Session and Event can be stringTimestamp can be date and Parameters can be hstore(key value pairs available in PostgreSQL). After that, I can use SQL queries to compute the synthetic (or additional) data, store it temporarily in variables in a Rails Application (which will interact with PostgreSQL database and act as interface for the person who wants the JSON blob) and create JSON blob from it.

However I am not sure if PostgreSQL is the best choice for this use case. I have put the detailed question on SO at http://stackoverflow.com/questions/23544604/log-data-analytics

Looking for some help from the community.

Peeyush Agarwal


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux