>> > So if a service or host is unreachable for 3 or 4 mins, we get a
>> > notification. (However, in most cases it is a false positive, due to
>> > congestion or other causes.)
>> Looking through my email, from what I can recall there are no false
>> positives. xen6 had to be power-cycled, which caused all the other
>> collateral notifications.
>
> How long was it down? Why should a normal reboot send 23 mails?
> A reboot is not an exceptional thing. Is it?
> An alert should be sent only when it's absolutely necessary...
> it should report only when xen6 comes up but a service does not come up..
> What do you think?
> Thanks.

Remembering that unresponsive and down are different things, it looks
like it went unresponsive ~0210 UTC (2-3 minutes before the first email).
I *think* this might have just been the domUs at that point; from the IRC
logs it looks like the dom0 was rebooted sometime around 0228
(potentially beforehand, I do not know).

It's 1 email per checked item for down/up and I guess in perspective, it
was quite big...

IMO these reports are 'absolutely necessary' and I personally like to
check them every now and then, especially after an outage like this, to
see if everything is back up (the service/host overview on the nagios web
interface is handy for this).

- Nigel

>
> --
> Regards,
> Susmit.
>
> =============================================
> ssh
> 0x86DD170A
> http://www.fedoraproject.org/wiki/SusmitShannigrahi
> =============================================

_______________________________________________
Fedora-infrastructure-list mailing list
Fedora-infrastructure-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-infrastructure-list