On Tue, Nov 06, 2018 at 02:30:19PM +0200, juice wrote:
> Lennart Poettering wrote on 2018-11-06 12:27:
> > On Tue, 06.11.18 11:57, juice (juice@xxxxxxxxxxx) wrote:
> >
> > > Hi,
> > >
> > > During the past half year I have seen systemd dump core three times
> > > due to what I suspect is a hashmap corruption or race.
> > > Each time it looks a bit different and is triggered by different
> > > things, but it somehow centers on hashmap operations.
> > >
> > > What would be the preferred way to debug this? I cannot add huge
> > > logging, as this is something that happens once in a blue moon and
> > > always on different compute nodes.
> > > Is there some way I could easily test it by increasing the chance of
> > > such corruption/race happening?
> >
> > This looks very much like a memory corruption of some sort, and
> > valgrind should be the tool of choice to track that down.
> >
> > Lennart
>
> Thanks for the prompt reply, Lennart.
>
> I agree; using valgrind was indeed something already considered, however I
> suspect it might add some overhead to systemd operation?
>
> The question here was more along the lines of how to trigger the problem.
> It is quite rare, as it seems the occurrence is about once per two months on
> our QL3 test pool, which contains hundreds of VM guests...
> It would be impractical to build and deploy a release which contains systemd
> running under valgrind on every node! :)
>

In such scenarios, where valgrind's overhead is impractical, I'd give
AddressSanitizer a try.

https://clang.llvm.org/docs/AddressSanitizer.html

Regards,
Vito Caputo
_______________________________________________
systemd-devel mailing list
systemd-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/systemd-devel