On Tue, Nov 4, 2008 at 6:49 AM, Erling Ringen Elvsrud <erlingre@xxxxxxxxx> wrote:

> Hello list,
>
> I have been reading and thinking about proper thresholds for
> the check_load plugin in Nagios.
>
> My current understanding of load in Linux:
>
> The load average over 1, 5, and 15 min in Linux is the number of processes
> in running, runnable, and uninterruptible sleep states
> (according to the load entry in Wikipedia).
> According to the same Wikipedia page, processes in the uninterruptible state
> are usually waiting for I/O, so both CPU-bound and I/O-bound processes
> can contribute to the load average.
> So if we have a server with many I/O-bound processes, the
> CPU utilization can be low while the load average is high.
> The number of cores or CPUs also determines the impact of the load.
> A load of 8 can therefore mean that all cores in a 2 x 4 core server are
> utilized.
>
> To determine where to set warning and critical thresholds, the impact the
> load has on the services running must also be taken into account. For
> instance, on a system running large batch jobs a high load can be less
> of a problem than on a system running a webserver where users want a
> response quickly.
>
> So if you had a server where you had little knowledge of the services,
> how would you pick thresholds for 1, 5, and 15 min warning and 1, 5, and
> 15 min critical?
>
> Thanks,
>
> Erling
>
> --
> redhat-list mailing list
> unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
> https://www.redhat.com/mailman/listinfo/redhat-list

While not Linux, the rule for Solaris/SPARC is 5 for all. We use that for Solaris/x86 and Linux.
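
For what it's worth, the per-core reasoning in the quoted question (a load of 8 filling a 2 x 4 core server) can be turned into a rough starting point by scaling thresholds with the core count. The sketch below is only an illustration: the per-core multipliers and the function names are assumptions made up for this example, not recommended values, and they should be tuned to the services on the box. It only builds the comma-separated 1, 5, and 15 minute triples that check_load takes for -w and -c.

```python
import os

# Hypothetical per-core multipliers for the 1, 5, and 15 minute averages.
# These numbers are assumptions for illustration only; tune them per service.
WARN_PER_CORE = (1.5, 1.0, 0.75)
CRIT_PER_CORE = (2.0, 1.5, 1.0)


def load_thresholds(cores, per_core):
    """Scale the per-core multipliers by the number of cores."""
    return tuple(round(cores * m, 2) for m in per_core)


def check_load_args(cores):
    """Build -w/-c arguments as comma-separated 1, 5, 15 minute triples."""
    warn = ",".join(str(v) for v in load_thresholds(cores, WARN_PER_CORE))
    crit = ",".join(str(v) for v in load_thresholds(cores, CRIT_PER_CORE))
    return ["-w", warn, "-c", crit]


if __name__ == "__main__":
    # os.cpu_count() may return None, so fall back to a single core.
    cores = os.cpu_count() or 1
    print("check_load", " ".join(check_load_args(cores)))
```

The 1 minute threshold is deliberately the loosest so that short spikes do not page anyone, while a sustained 15 minute average near the core count does. On an I/O-heavy server the load can sit well above the core count with idle CPUs, so a fixed value such as the 5 mentioned above may fit better there than anything derived from cores.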