On Mon, Jan 18, 2021 at 04:25:09PM +0100, Pierre-Yves Chibon wrote:
> Good Morning Everyone,
>
> While planning work, the CPE team has realized that a number of our initiatives
> actually start with a research phase to find the most appropriate technical
> solution.
> This leads to some issues with planning as without knowing the technical
> solution we want to take, it's hard to evaluate the amount of work needed and
> thus the time it'll take to do it.
>
> In order to help with this, we're creating a small sub-team in CPE, called the
> ARC team for Advance Reconnaissance Crew*.
> The goal of this team will be to investigate what we believe to be the possible
> technical solutions for initiatives and advise the team on what they believe
> would be the appropriate solution.
> To this end, we will reach out when we start looking for ideas as you may have
> ideas that we did not think about.
>
> The first investigation, led by Will Woods, Mark O'Brien and me, will be around
> datanommer and datagrepper.
>
> datanommer is an application listening to fedmsg and filling a (postgresql)
> database with all the messages passing on the bus.
> datagrepper is a web application exposing these messages and offering a way to
> filter or search them.
> available at: https://apps.fedoraproject.org/datagrepper/
>
> Currently our ideas are:
> - for datanommer:
>   - port it to fedora-messaging
>   - adjust it to whichever solution we choose to replace datagrepper
>
> - for datagrepper:
>   - keep it as is
>   - Replace by
>     - postgrest https://postgrest.org/
>     - prest https://github.com/prest/prest
>     - kinto https://docs.kinto-storage.org/en/stable/
>     - Swagger/OpenAPI https://swagger.io/
>   - Add support for Graphql
>
> - for the postgresql server
>   - Split messages per year in different tables
>     - Unite them using a postgresql view
>   - Kick out the old messages per year
>     - Keep the current year + n-1 in the current DB
>     - Kick the other to another DB?
>     - Kick the other to a tarball somewhere?
>     - Output the database daily dump to file / year
>   - TimescaleDB, a postgresql plugin for time-series data
>     - https://alibaba-cloud.medium.com/postgresql-time-series-database-plug-in-timescaledb-deployment-practices-6a07e246eb0d
>     - https://dev.t-matix.com/blog/postgresql-as-a-time-series-database/
>     - https://docs.timescale.com/latest/introduction
>   - Make the msg field in the message table be a JSON field
>
> Would you have any other ideas of things we could look at?

Just as a follow-up to this thread, our findings can be found at:
https://fedora-arc.readthedocs.io/en/latest/datanommer_datagrepper/index.html
and I've also presented them in a blog post at:
http://blog.pingoured.fr/index.php?post/2021/02/26/datanommer/datagrepper-investigations

Hoping this helps,
Pierre