On Tue, Oct 22, 2019 at 12:34:45PM +0200, Umut Tezduyar Lindskog wrote:
> I am curious, Zbigniew, how you find out if the coredump was on a starved
> process?

A very common case is systemd-journald, which gets SIGABRT while in a
read(), write(), or similar syscall. Another case is when systemd-udevd
workers get ABRT when doing open() on a device.

> This is common for our embedded devices. I didn't think it was common for
> desktops too.
> It is really useful for getting coredumps on deadlocked applications. For
> that reason I don't think it is good to remove this functionality
> completely.

Yes, I never suggested removing it completely. I'm just saying that
for the type of systems that Fedora targets, I don't recall any actual
deadlock. For more specialized systems, where the workload is more
predictable, it makes sense to have the watchdog.

There might be cases where the kernel is dead-locked internally, and
e.g. open() or modprobe() never returns. For those cases it might be
useful to get the backtrace, but actually killing the process and/or
storing the coredump is useful.

Zbyszek

> Umut
>
> On Mon, Oct 21, 2019 at 7:51 PM Zbigniew Jędrzejewski-Szmek <
> zbyszek@xxxxxxxxx> wrote:
>
> > In principle, the watchdog for services is nice. But in practice it
> > seems to bring only grief. The Fedora bugtracker is full of automated
> > reports of ABRTs, and of those that were fired by the watchdog, pretty
> > much 100% are bogus, in the sense that the machine was resource starved
> > and the watchdog fired.
> >
> > There are a few downsides to the watchdog killing the service:
> > 1. if it is something like logind, it is possible that it will cause
> >    user-visible failure of other services
> > 2. restarting of the service causes additional load on the machine
> > 3. coredump handling causes additional load on the machine, quite
> >    significant
> > 4. those failures are reported in bugtrackers and waste everyone's time.
> >
> > I had the following ideas:
> > 1. disable coredumps for watchdog abrts: systemd could set some flag
> >    on the unit or otherwise notify systemd-coredump about this, and it
> >    could just log the occurrence but not dump the core file.
> > 2. generally disable watchdogs and make them opt in. We have
> >    'systemd-analyze service-watchdogs', and we could make the default
> >    configurable to "yes|no".
> >
> > What do you think?
> > Zbyszek
> > _______________________________________________
> > systemd-devel mailing list
> > systemd-devel@xxxxxxxxxxxxxxxxxxxxx
> > https://lists.freedesktop.org/mailman/listinfo/systemd-devel