Re: OSD Segfaults after Bluestore conversion

Did you solve this?  Similar issue.
_____________________________________________


On Wed, Feb 28, 2018 at 3:46 PM Kyle Hutson <kylehutson@xxxxxxx> wrote:
I'm following up from a while ago. I don't think this is the same bug. The bug referenced shows "abort: Corruption: block checksum mismatch", and I'm not seeing that on mine.

Now I've had 8 OSDs down on this one server for a couple of weeks, and I just tried to start it back up. Here's a link to the log of that OSD (which segfaulted right after starting up): http://people.beocat.ksu.edu/~kylehutson/ceph-osd.414.log

To me, it looks like the logs are providing surprisingly few hints as to where the problem lies. Is there a way I can turn up logging to see if I can get any more info as to why this is happening?

On Thu, Feb 8, 2018 at 3:02 AM, Mike O'Connor <mike@xxxxxxxxxx> wrote:
On 7/02/2018 8:23 AM, Kyle Hutson wrote:
> We had a 26-node production ceph cluster which we upgraded to Luminous
> a little over a month ago. I added a 27th-node with Bluestore and
> didn't have any issues, so I began converting the others, one at a
> time. The first two went off pretty smoothly, but the 3rd is doing
> something strange.
>
> Initially, all the OSDs came up fine, but then some started to
> segfault. Out of curiosity more than anything else, I did reboot the
> server to see if it would get better or worse, and it pretty much
> stayed the same - 12 of the 18 OSDs did not properly come up. Of
> those, 3 again segfaulted.
>
> I picked one that didn't properly come up and copied the log to where
> anybody can view it:
> http://people.beocat.ksu.edu/~kylehutson/ceph-osd.426.log
>
> You can contrast that with one that is up:
> http://people.beocat.ksu.edu/~kylehutson/ceph-osd.428.log
>
> (which is still showing segfaults in the logs, but seems to be
> recovering from them OK?)
>
> Any ideas?
Ideas? Yes.

There is a bug which is hitting a small number of systems, and at this
time there is no solution. Issue details are at
http://tracker.ceph.com/issues/22102.

Please submit more details of your problem on the ticket.

Mike


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
