On 11/6/18 9:57 UTC, juice wrote:
During the past half year I have seen systemd dump core three times due to what I suspect is a hashmap corruption or race. Each time it looks a bit different and is triggered by different things, but it somehow centers on hashmap operations.
Three intermittent hardware failures in one year on 10,000 boxes is normal. Keep good records. If the same box appears twice, physically destroy it.

Meanwhile, log all events to a circular buffer that just keeps rotating: date+time (32 bits, 1 microsecond precision), caller (return address), and an argument summary (fixed format: string prefixes or a hash). Analyze the dump.

Lock each hashmap operation to ensure single-threaded operation; prevent even multiple [supposedly] read-only accesses. Lock each signal handler: only one instance of a given signal at a time.
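
For illustration, the circular event log described above could look roughly like the sketch below. This is not systemd code: every evlog_* name, the slot count, and the choice of FNV-1a as the argument summary are assumptions made for the example, and __builtin_return_address is a GCC/Clang builtin. Note that a 32-bit microsecond counter wraps about every 71.6 minutes, so the timestamps only order events relative to their neighbors in the buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical names throughout; nothing here is an existing systemd API. */
#define EVLOG_SLOTS 1024u   /* power of two, so the index wraps with a mask */

struct evlog_entry {
    uint32_t usec;      /* low 32 bits of a monotonic microsecond clock */
    void    *caller;    /* return address of the code that logged the event */
    uint32_t arg_hash;  /* fixed-size summary of the argument */
};

static struct evlog_entry evlog[EVLOG_SLOTS];
static uint32_t evlog_next;  /* events logged so far; slot = evlog_next & mask */

/* FNV-1a, used only as a cheap fixed-size argument summary. */
static uint32_t evlog_hash(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Truncating to 32 bits is deliberate: the value wraps roughly every 71.6 min. */
static uint32_t evlog_now_usec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint32_t)((uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u);
}

/*
 * Record one event, overwriting the oldest slot once the buffer is full.
 * noinline keeps __builtin_return_address(0) pointing at the real caller.
 * Not thread-safe as written: callers are assumed to hold whatever lock
 * already serializes the hashmap operation being logged.
 */
__attribute__((noinline))
void evlog_record(const char *arg)
{
    struct evlog_entry *e;

    assert(arg != NULL);
    e = &evlog[evlog_next++ & (EVLOG_SLOTS - 1u)];
    e->usec = evlog_now_usec();
    e->caller = __builtin_return_address(0);
    e->arg_hash = evlog_hash(arg);
}
```

A call such as evlog_record(key) at the top of each hashmap operation would be enough to start with. Because the buffer is a static array, it ends up in the core dump and can be read back from there, oldest entry first, starting at slot evlog_next & (EVLOG_SLOTS - 1).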