On Tue, Feb 20, 2018 at 6:52 AM, Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
> On Mon, Feb 19, 2018 at 12:22 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>> What if we update the segv crash handler to, in addition to dumping the
>> recent log and stack trace to the log file, also
>>
>> - writes the same information to a standalone file, e.g.
>>   /var/lib/ceph/crashes/$type.$id/$timestamp
>> - make the daemon check for previous crashes on startup, and report them
>>   to the mgr
>> - make the mgr keep some record of previous crashes (if not the full log,
>>   just the timestamp so we know when it happened)
>> - index/fingerprint by stack trace?
>> - surface a health warning for recent crashes?
>> - make an opt-in mgr function that works similar to python's sentry: post
>>   the crash report to some central archive where developers will hear about
>>   it.
>
> +1
>
> There was a very useful project that I can't find anymore done by
> Google which would allow the segfault handler to create a coredump and
> save it to a file (via forking a helper process I believe). Relying on

Was it breakpad,
https://github.com/google/breakpad/blob/master/docs/getting_started_with_breakpad.md
?

> the operating system kernel core dump configuration sucks and it'd be
> nice to just do our own thing to collect coredumps persistently.
>
> Using that we could then even collect coredumps at the ceph-mgr for
> basic processing like generating a backtrace using a ceph daemon
> executable with debugging symbols. That backtrace would be
> significantly more useful when sending a crash report to the central
> archive.
>
> --
> Patrick Donnelly
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Cheers,
Brad
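
For illustration, here is a rough sketch of the file layout and stack-trace fingerprinting ideas from Sage's list. It is not Ceph code: the function names, the JSON report format, and the choice of SHA-256 are all assumptions made for the example, and a real segv handler could not run Python (it would need async-signal-safe C/C++). The only detail taken from the thread is the $type.$id/$timestamp directory layout.

```python
import hashlib
import json
import re
from pathlib import Path

# Hypothetical helpers for illustration only; none of these names exist in Ceph.

# Hex addresses and offsets differ between runs, so drop them before hashing.
_ADDR = re.compile(r"0x[0-9a-fA-F]+")


def fingerprint(frames):
    """Hash a stack trace with addresses stripped, so repeats of the
    same crash map to the same id."""
    h = hashlib.sha256()
    for frame in frames:
        h.update(_ADDR.sub("", frame).strip().encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def write_crash_report(base_dir, daemon_type, daemon_id, timestamp,
                       frames, recent_log):
    """Write one report to <base_dir>/<type>.<id>/<timestamp>."""
    crash_dir = Path(base_dir) / f"{daemon_type}.{daemon_id}"
    crash_dir.mkdir(parents=True, exist_ok=True)
    path = crash_dir / timestamp
    report = {
        "timestamp": timestamp,
        "fingerprint": fingerprint(frames),
        "backtrace": list(frames),
        "recent_log": list(recent_log),
    }
    path.write_text(json.dumps(report, indent=2))
    return path


def previous_crashes(base_dir, daemon_type, daemon_id):
    """What a daemon might do on startup: load any earlier reports,
    oldest first, so they can be forwarded to the mgr."""
    crash_dir = Path(base_dir) / f"{daemon_type}.{daemon_id}"
    if not crash_dir.is_dir():
        return []
    return [json.loads(p.read_text())
            for p in sorted(crash_dir.iterdir()) if p.is_file()]
```

With something like this, the mgr could group reports by the "fingerprint" field and raise a health warning when a recent timestamp shows up, without needing to keep the full log for every crash.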