Greetings everyone,

Thanks to darknao, we have just enabled monitoring by default for our OpenShift applications.

Note that it will not be active for an application until the next run of its playbook pushes it out. I will look at running playbooks over the next few days for most projects, but if you are an appowner and want it sooner, just run your playbook (and let me know so I don't need to).

Some notes:

By default it alerts on the things in
./roles/openshift/project/templates/prometheusRules.yml
which includes cronjobs failing, pods crashing for various reasons, etc. We can look at expanding this if there are other things that are generally good to monitor.

When an alert triggers, it sends email to the appowners by default. You can optionally set an alert_users list in your playbook, and alerts will then be sent only to those users (not to appowners).

If for some reason you do not want any of this monitoring on your application, you can set:

alerting: False

to avoid it. I'd really like to know why if you plan on doing that, however.

Hopefully this will help us see when things aren't working right before we get user reports about it. :)

Many thanks again to darknao for setting this up. :)

kevin