On Fri, Jun 26, 2020 at 10:32:14AM +0100, David Kirwan wrote:
> Hi all,
>
> If we are moving towards openshift/kubernetes backed services, we should
> probably be sticking with containers rather than Vagrant. We can use CRC
> [1] (CodeReady Containers) or minikube [2] for most local dev work.
>
> I'd be very much in favour of having an Infra managed Prometheus instance
> (+ grafana and alertmanager on Openshift), it's something I hoped to work on
> within CPE sustaining in fact.

You know, I'm not in love with that stack. It could well be that I just
haven't used it enough or don't know enough about it, but it seems
needlessly complex. ;(

I'd prefer we start out at a lower level... what are our requirements?
Then, see how we can set up something to meet those.

Off the top of my head (I'm sure I can think of more):

* Ability to collect/gather rsyslog output from all our machines.
* Ability to generate reports of 'variances' from all that (ie, what odd
  messages should a human look at?)
* Handle all the logs from openshift, possibly multiple clusters?
* Ability to easily drill down and look at some specific historical logs
  (ie, show me the logs for the bodhi-web pods from last week when there
  was an issue).

Perhaps prometheus/grafana/alertmanager is the solution, but there are
also tons of other open source projects out there that we might look
into.

kevin
--
>
> - [1] https://github.com/code-ready/crc
> - [2] https://minikube.sigs.k8s.io/docs/
>
> On Fri, 26 Jun 2020 at 10:23, Luca BRUNO <lucab@xxxxxxxxxx> wrote:
>
> > On Thu, 25 Jun 2020 15:59:44 -0700
> > Kevin Fenzi <kevin@xxxxxxxxx> wrote:
> >
> > > > What else would we want in there?
> > >
> > > Monitoring - we will likely get our nagios setup again soon just
> > > because it's mostly easy, but it's also not ideal.
> >
> > On this one (or more broadly "observability") I'd still like to see an
> > infra-managed Prometheus to internally cover and sanity-check the
> > "openshift-apps" services.
> > I remember this was on the "backlog" dashboard at Flock'19 but I don't
> > know if it got translated to an actual action item/ticket in the end.
> >
> > Ciao, Luca
>
> --
> David Kirwan
> Software Engineer
>
> Community Platform Engineering @ Red Hat
>
> T: +(353) 86-8624108 IM: @dkirwan
_______________________________________________
infrastructure mailing list -- infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to infrastructure-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@xxxxxxxxxxxxxxxxxxxxxxx