If you can open an SSH session to the IPMI console and run it inside screen, you can log the screen output to a file and see what was on the console when the server locked up. That's how I track kernel panics.
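Once you have such a log, a small script can pull out the interesting part. Below is a rough sketch in Python, not something from my actual setup: it assumes screen wrote its log to the default `screenlog.0` (which is what `screen -L` produces for window 0), and the function name `find_crash_context` and the signature list are my own illustrative choices. The strings are common kernel panic and lockup messages, but the list is certainly not exhaustive.

```python
from collections import deque

# Substrings the Linux kernel commonly prints on panics and lockups.
# Illustrative, not exhaustive: extend it with whatever your console shows.
SIGNATURES = (
    "Kernel panic - not syncing",
    "BUG: soft lockup",
    "hard LOCKUP",
    "blocked for more than",
    "Oops:",
)


def find_crash_context(lines, before=20):
    """Return (line_number, context) for the first line matching a signature.

    `context` holds up to `before` preceding lines plus the matching line.
    Returns None when no signature is found.
    """
    recent = deque(maxlen=before)  # rolling window of the last `before` lines
    for number, line in enumerate(lines, start=1):
        if any(sig in line for sig in SIGNATURES):
            return number, list(recent) + [line]
        recent.append(line)
    return None


def report(path="screenlog.0", before=20):
    """Print the console output leading up to the first crash signature."""
    # errors="replace" because console logs often contain escape sequences
    # and stray bytes that are not valid UTF-8.
    with open(path, errors="replace") as log:
        hit = find_crash_context(log, before)
    if hit is None:
        print("no panic or lockup signature found in", path)
        return
    number, context = hit
    print(f"first signature at line {number} of {path}:")
    for line in context:
        print(line.rstrip("\n"))
```

Calling `report("screenlog.0")` after a freeze prints the last lines the console showed before the first match. On a true hard lockup the kernel may not manage to print anything at all, in which case the tail of the log is all you get.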
On Fri, Oct 27, 2017 at 1:53 PM Bogdan SOLGA <bogdan.solga@xxxxxxxxx> wrote:
Thank you very much for the reply, Ilya!

The server was completely frozen / hard lockup, we had to restart it via IPMI. We grepped the logs trying to find the culprit, but to no avail.

Any hint on how to troubleshoot the (eventual) freezes is highly appreciated.

Understood on the kernel recommendation. We'll continue to use 4.10, then.

Thanks, a lot!

Kind regards,
Bogdan

On Fri, Oct 27, 2017 at 8:04 PM, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
On Fri, Oct 27, 2017 at 6:33 PM, Bogdan SOLGA <bogdan.solga@xxxxxxxxx> wrote:
> Hello, everyone!
>
> We have recently upgraded our Ceph pool to the latest Luminous release. On
> one of the servers that we used as Ceph clients we had several freeze
> issues, which we empirically linked to the concurrent usage of some I/O
> operations - writing in an LXD container (backed by Ceph) while there was an
> ongoing PG rebalancing. We searched for the issue's cause through the logs,
> but we haven't found anything useful.
What kind of freezes -- temporary slowdowns or hard lockups? Did they
resolve on their own or did you have to intervene?
>
> At that time the server was running Ubuntu 16 with a 4.5 kernel. We thought
> an upgrade to the latest HWE kernel (4.10) would help, but we had the same
> freezing issues after the kernel upgrade. Of course, we're aware that we
> have tried to fix / avoid the issue without understanding its cause.
>
> After seeing the OS recommendations from the Ceph page, we reinstalled the
> server (and got the 4.4 kernel), but then ran into a feature set mismatch
> issue when mounting an RBD image. We concluded that the feature set requires
> a kernel > 4.5.
>
> Our question - how would you recommend we proceed? Shall we re-upgrade to
> the HWE kernel (4.10) or to another kernel version? Would you recommend an
> alternative solution?
The OS recommendations page lists upstream kernels, as general
guidance. As long as the kernel is fairly recent and maintained
(either upstream or by the distributor), it should be fine. 4.10 is
certainly better than 4.4-based kernels, at least for the kernel
client.
Thanks,
Ilya
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com