Collection of strange lockups on 0.51

Hi,

This is completely off-topic for this list, but I'm asking because only
Ceph triggers this bug :)

With 0.51, the following happens: if I kill an OSD, one or more
neighboring nodes may hang with CPU lockups. The hangs are not related
to temperature, overall interrupt count, or load average, and they occur
randomly across the 16-node cluster. I am almost sure that Ceph is
triggering some hardware bug, but I am not sure of its origin. Also, for
a short time after a reset following such a crash, any action may cause
a new lockup.

Before blaming system drivers and investigating further, may I ask
whether anyone has faced a similar problem? I am using 802.3ad link
aggregation on a pair of Intel I350 NICs for general connectivity. I
have attached some traces that were pushed to netconsole (in some cases
the machine died hard, i.e. without even sending a final message over
netconsole, so the log is incomplete).

Attachment: netcon.log.gz
Description: GNU Zip compressed data

