Action plan for koji01 reboots

So I'd like to put together an action plan to deal with the koji01
reboot issues. Right now, we're not capturing crash dumps on this
machine (or on any other, though I'm not sure there's value in doing so
unless we have an active, systemic problem like the one we're facing
here). I can't promise there'd be any dumps *to* capture, but there
probably would be. I'd like to set up kdump on this machine after the
beta freeze is over, but I'd like buy-in from other people before doing
it. Here's what I'd propose:

1) Present another LV from bxen02 to koji01 and mount it at /var/crash
(the rootfs on koji01 is only 10GB, and we'd need more for a crash
dump or two - I'd say 20GB would be sufficient to hold two crashes,
since it's an 8GB domain). It looks like VolGroup01, where koji01
lives, has about 80GB free.
2) Install kexec-tools and configure it appropriately (this includes
adding crashkernel=128M@16M to the kernel line in grub.conf)
3) Reboot the machine, and wait for it to crash again.
4) Analyze the (hopefully) resulting crash dumps :)
5) Profit!
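
As a rough sanity check on the 20GB figure in step 1, here's a minimal
Python sketch. It assumes the worst case, where each dump is a full,
uncompressed copy of the domain's 8GB of RAM. The function name and the
worst-case assumption are mine; dumps filtered with makedumpfile would
normally come out smaller than this.

```python
def dumps_that_fit(volume_gb, ram_gb):
    """Worst-case count of full-RAM dumps that fit on a volume.

    Assumes every dump is as large as the domain's RAM (no filtering,
    no compression), which is an upper bound, not a measurement.
    """
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    return volume_gb // ram_gb


# Figures taken from the proposal above.
RAM_GB = 8           # koji01 is an 8GB domain
ROOTFS_GB = 10       # total size of the existing rootfs
PROPOSED_LV_GB = 20  # new LV to mount at /var/crash
VG_FREE_GB = 80      # approximate free space in VolGroup01

print(dumps_that_fit(PROPOSED_LV_GB, RAM_GB))  # 2
print(dumps_that_fit(ROOTFS_GB, RAM_GB))       # 1, and only if rootfs were empty
print(PROPOSED_LV_GB <= VG_FREE_GB)            # True
```

So 20GB covers two worst-case dumps with a little headroom, and the
volume group has room for it.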

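For anyone unfamiliar with the crashkernel= syntax in step 2: the value
has the form size@offset, so 128M@16M asks the kernel to reserve 128MB
for the capture kernel starting at physical address 16MB. The sketch
below just decodes that string. The helper name is hypothetical, and it
only handles the simple size@offset form used here.

```python
import re

# Binary multipliers for the K/M/G suffixes.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_crashkernel(value):
    """Return (size_bytes, offset_bytes) for a simple 'size@offset' value."""
    match = re.fullmatch(r"(\d+)([KMG])@(\d+)([KMG])", value)
    if match is None:
        raise ValueError("expected size@offset, e.g. 128M@16M: %r" % value)
    size = int(match.group(1)) * _UNITS[match.group(2)]
    offset = int(match.group(3)) * _UNITS[match.group(4)]
    return size, offset


size, offset = parse_crashkernel("128M@16M")
print(size // 1024 ** 2, offset // 1024 ** 2)  # 128 16
```
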
Any objections?
