Re: Mysteriously dead OSD process


 



Hi J-P Methot,

perhaps my response is a bit late, but this reminds me to some degree of an issue we were facing yesterday.

First of all, you might want to set debug-osd to 20 for this specific OSD and see whether the log becomes more helpful. Please share it if possible.
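
For reference, a rough sketch of how the debug level could be raised on a cephadm-managed cluster. It assumes the affected daemon is osd.87 (the ID that appears in the quoted log line; adjust if it differs). Since the OSD dies right after startup, the option is stored in the central config database so that it should already be in effect at the next start. The `run` helper is only an illustration: it prints each command and executes it only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumption: osd.87 is the crashing daemon (taken from the quoted log line).
OSD_ID="${OSD_ID:-87}"

# Illustrative helper: print each command, execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Raise OSD debug logging to 20 for this one OSD (stored in the mon config database).
run ceph config set "osd.${OSD_ID}" debug_osd 20

# After the next start attempt, collect the daemon's log on the OSD host.
run cephadm logs --name "osd.${OSD_ID}"

# Remove the override again once the log has been captured.
run ceph config rm "osd.${OSD_ID}" debug_osd
```

Level 20 is very verbose, so it is probably worth removing the override as soon as the crash has been captured.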

Secondly, I'm curious whether the last reported PG (2.99s3) is always the same before the crash. If so, you might want to remove it from the OSD using ceph-objectstore-tool's export-remove command; in our case this helped to bring the OSD up. The exported PG can be loaded into another OSD or (if that's the single problematic OSD) just thrown away and fixed by scrubbing...
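
A minimal sketch of that export-remove step, under a few assumptions: the OSD ID (87) and PG ID (2.99s3) are taken from the quoted log line, the data path is the usual location seen from inside `cephadm shell --name osd.87`, and the export file path is just a placeholder that should point at storage with enough free space. The OSD has to be stopped while the tool runs. As above, the illustrative `run` helper only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumptions: IDs come from the quoted log line; both paths are placeholders to adapt.
OSD_ID="${OSD_ID:-87}"
PGID="${PGID:-2.99s3}"
DATA_PATH="${DATA_PATH:-/var/lib/ceph/osd/ceph-${OSD_ID}}"
EXPORT_FILE="${EXPORT_FILE:-/var/tmp/pg-${PGID}.export}"

# Illustrative helper: print each command, execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Export the suspect PG shard to a file and remove it from this (stopped) OSD.
run ceph-objectstore-tool --data-path "$DATA_PATH" \
    --pgid "$PGID" --op export-remove --file "$EXPORT_FILE"

# The export could later be loaded into another stopped OSD with
# "--op import --file <export file>" against that OSD's data path.
```

Keeping the export file until the cluster is healthy again leaves a way back if the PG turns out not to be the culprit.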


Thanks,

Igor

On 05/04/2023 23:36, J-P Methot wrote:
Hi,


We currently use Ceph Pacific 16.2.10 deployed with Cephadm on this storage cluster. Last night, one of our OSDs died. Since its storage is an SSD, we ran hardware checks and found no issue with the SSD itself. However, if we try starting the service again, the container crashes 1 second after booting up. If I look at the logs, there's no error. You can see the OSD starting up normally, and the last line before the crash is:

debug 2023-04-05T18:32:57.433+0000 7f8078e0c700  1 osd.87 pg_epoch: 207175 pg[2.99s3( v 207174'218628609 (207134'218623666,207174'218628609] local-lis/les=207140/207141 n=38969 ec=41966/315 lis/c=207140/207049 les/c/f=207141/207050/0 sis=207175 pruub=11.464111328s) [5,228,217,NONE,17,25,167,114,158,178,159]/[5,228,217,87,17,25,167,114,158,178,159]p5(0) r=3 lpr=207175 pi=[207049,207175)/1 crt=207174'218628605 mlcod 0'0 remapped NOTIFY pruub 12054.601562500s@ mbc={}] state<Start>: transitioning to Stray

I don't really see how this line could cause the OSD to crash. Systemd just writes:

Stopping Ceph osd.83 for (uuid)

What could cause this OSD to boot up and then suddenly die? Apart from the Ceph daemon logs and the systemd logs, is there another way I could gain more information?




