Hello,
Some OSDs are not getting activated after a reboot, which leaves those OSDs in a failed state.
As you can see below, the mount point is never updated to the /var/lib/ceph/osd/ceph-<num> directory; the data partition stays mounted at a temporary (incorrect) mount point, so osd.<num> cannot be mounted/activated.
Env: RHEL 7.2, EC 4+1, v11.2.0 (Kraken) with BlueStore.
# grep mnt /proc/mounts
/dev/sdh1 /var/lib/ceph/tmp/mnt.om4Lbq xfs rw,noatime,attr2,inode64,sunit=512,swidth=512,noquota 0 0
/dev/sdh1 /var/lib/ceph/tmp/mnt.EayTmL xfs rw,noatime,attr2,inode64,sunit=512,swidth=512,noquota 0 0
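For now I can bring an affected OSD back by hand with something like the below (just a rough sketch; mnt.om4Lbq / mnt.EayTmL / sdh1 are taken from the example above, and I'm assuming the whoami file is still present on the data partition):

# cat /var/lib/ceph/tmp/mnt.om4Lbq/whoami        <- shows which osd.<num> this partition belongs to
# umount /var/lib/ceph/tmp/mnt.EayTmL            <- drop the stale temporary mounts
# umount /var/lib/ceph/tmp/mnt.om4Lbq
# systemctl reset-failed ceph-disk@dev-sdh1.service ceph-disk@dev-sdh2.service
# ceph-disk activate /dev/sdh1                   <- should remount under /var/lib/ceph/osd/ceph-<num> and start the daemon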
From /var/log/messages:
--
May 26 15:39:58 cn1 systemd: Starting Ceph disk activation: /dev/sdh2...
May 26 15:39:58 cn1 systemd: Starting Ceph disk activation: /dev/sdh1...
May 26 15:39:58 cn1 systemd: start request repeated too quickly for ceph-disk@dev-sdh2.service   => suspecting this rate limiting could be the root cause (see the drop-in sketch below).
May 26 15:39:58 cn1 systemd: Failed to start Ceph disk activation: /dev/sdh2.
May 26 15:39:58 cn1 systemd: Unit ceph-disk@dev-sdh2.service entered failed state.
May 26 15:39:58 cn1 systemd: ceph-disk@dev-sdh2.service failed.
May 26 15:39:58 cn1 systemd: start request repeated too quickly for ceph-disk@dev-sdh1.service
May 26 15:39:58 cn1 systemd: Failed to start Ceph disk activation: /dev/sdh1.
May 26 15:39:58 cn1 systemd: Unit ceph-disk@dev-sdh1.service entered failed state.
May 26 15:39:58 cn1 systemd: ceph-disk@dev-sdh1.service failed.
--
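If this start-rate limiting really is the root cause, one workaround I am thinking of trying (untested, the interval/burst values are just a guess) is a systemd drop-in to relax the limit on ceph-disk@.service:

# cat /etc/systemd/system/ceph-disk@.service.d/90-startlimit.conf
[Service]
StartLimitInterval=30min
StartLimitBurst=30

# systemctl daemon-reload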
Note that this issue occurs only intermittently after a reboot.
Also, we never faced this problem with Jewel.
Awaiting your comments.
Thanks
Jayaram