Re: ceph-disk activate-block: not a block device

Hi All,

 

We are seeing the same problem here at Rutherford Appleton Laboratory:

 

During our patching against Stack Clash on our large physics data cluster, only about 8 of the 36 OSD disks per node remount when a storage node is rebooted. We coaxed the rest to mount manually during the reboot campaign (see method below), but obviously we want a longer-term solution.

 

I believe this problem occurs because many of the OSD daemons are started before their OSD disks are mounted.
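
One way to confirm that ordering on an affected node is to interleave the relevant journal entries from the current boot, for example (a sketch; it assumes journalctl accepts glob patterns for -u, as recent systemd versions do):

# Show ceph-osd and ceph-disk messages from the current boot in time order,
# to check whether ceph-osd@N was started before its disk was activated/mounted.
journalctl -b --no-pager -u 'ceph-osd@*' -u 'ceph-disk@*'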

 

From: [ceph-users] erratic startup of OSDs at reboot time, 2017-07-12, Graham Allan

We tried running udevadm trigger --subsystem-match=block --action="" with occasional success, but this wasn't reliable.
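
For reference, the usual full form of that trigger command would be something like the following (--action=add is my assumption; the value is missing from the quoted text above):

# Ask udev to re-emit "add" events for all block devices so the ceph-disk
# udev rules get another chance to activate the OSD partitions.
# (--action=add is an assumption; the value did not survive in the quote above.)
udevadm trigger --subsystem-match=block --action=add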

 

From: [ceph-users] CentOS7 Mounting Problem, 2017-04-10, Jake Young

Interesting that running partprobe causes the OSD disk to mount and the OSD to start automatically. However, I don’t know why this would fix the problem for subsequent reboots.
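
For reference, a minimal sketch of that workaround (the device name is illustrative): partprobe asks the kernel to re-read the partition table, and the resulting udev events are what the ceph-disk activation rules react to.

# Re-read the partition table on one OSD disk; the resulting udev events
# should give ceph-disk another chance to activate its partitions.
# /dev/sdb is illustrative - substitute the affected OSD device.
partprobe /dev/sdb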

 

Note: Interestingly, I had one example of this model of storage node (36 OSDs per host) in our development cluster (78 OSDs in total); over 5 reboots, all of its OSD disks mounted and the OSD processes started, so I am unable to reproduce the problem at small scale.

 

Best wishes,

Bruno

 

--------

 

Cluster:
5 MONs
1404 OSDs
39 storage nodes, 36 OSD disks per node, connected via a PCI HBA

 

Software:
OS: SL7x
Ceph Release: kraken
Ceph Version: 11.2.0-0
Ceph Deploy Release: kraken
Ceph Deploy Version: 1.5.37-0

 

OSDs created as follows:
ceph-deploy disk zap $sn_fqdn:sdb
ceph-deploy --overwrite-conf config pull $sn_fqdn
ceph-deploy osd prepare $sn_fqdn:sdb

 

Coaxing method (run on the affected storage node; ts, from the moreutils package, just timestamps each line of output):

# Walk the ceph-disk activation units that systemd lists and start them one at a time.
for srv in $(systemctl list-units -t service --full --no-pager -n0 | grep ceph-disk | awk '{print $2}'); do
    echo "Starting $srv" | ts
    systemctl start $srv
    sleep 1
done
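
Before (or instead of) looping over every unit, a quick way to see which activation units actually failed on a node is (a sketch using standard systemctl options):

# List only the ceph-disk activation units that ended up in a failed state.
systemctl list-units --failed --no-pager 'ceph-disk@*'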

 

 

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Willem Jan Withagen
Sent: 20 July 2017 19:06
To: Roger Brown; ceph-users
Subject: Re: [ceph-users] ceph-disk activate-block: not a block device

 

Hi Roger,

Device detection has recently changed (because FreeBSD does not have block devices),
so it could very well be that this is an actual problem where something is still wrong.
Please keep an eye out, and let me know if it comes back.

--WjW

On 20-07-2017 at 19:29, Roger Brown wrote:

So I disabled ceph-disk and will chalk it up as a red herring to ignore.

 

 

On Thu, Jul 20, 2017 at 11:02 AM Roger Brown <rogerpbrown@xxxxxxxxx> wrote:

Also I'm just noticing osd1 is my only OSD host that even has an enabled target for ceph-disk (ceph-disk@dev-sdb2.service).

 

roger@osd1:~$ systemctl list-units ceph*
  UNIT                       LOAD   ACTIVE SUB     DESCRIPTION
ceph-disk@dev-sdb2.service loaded failed failed  Ceph disk activation: /dev/sdb2
  ceph-osd@3.service         loaded active running Ceph object storage daemon osd.3
  ceph-mds.target            loaded active active  ceph target allowing to start/stop all ceph-mds@.service instances at once
  ceph-mgr.target            loaded active active  ceph target allowing to start/stop all ceph-mgr@.service instances at once
  ceph-mon.target            loaded active active  ceph target allowing to start/stop all ceph-mon@.service instances at once
  ceph-osd.target            loaded active active  ceph target allowing to start/stop all ceph-osd@.service instances at once
  ceph-radosgw.target        loaded active active  ceph target allowing to start/stop all ceph-radosgw@.service instances at once
  ceph.target                loaded active active  ceph target allowing to start/stop all ceph*@.service instances at once

 

roger@osd2:~$ systemctl list-units ceph*
UNIT                LOAD   ACTIVE SUB     DESCRIPTION
ceph-osd@4.service  loaded active running Ceph object storage daemon osd.4
ceph-mds.target     loaded active active  ceph target allowing to start/stop all ceph-mds@.service instances at once
ceph-mgr.target     loaded active active  ceph target allowing to start/stop all ceph-mgr@.service instances at once
ceph-mon.target     loaded active active  ceph target allowing to start/stop all ceph-mon@.service instances at once
ceph-osd.target     loaded active active  ceph target allowing to start/stop all ceph-osd@.service instances at once
ceph-radosgw.target loaded active active  ceph target allowing to start/stop all ceph-radosgw@.service instances at once
ceph.target         loaded active active  ceph target allowing to start/stop all ceph*@.service instances at once

 

roger@osd3:~$ systemctl list-units ceph*
UNIT                LOAD   ACTIVE SUB     DESCRIPTION
ceph-osd@0.service  loaded active running Ceph object storage daemon osd.0
ceph-mds.target     loaded active active  ceph target allowing to start/stop all ceph-mds@.service instances at once
ceph-mgr.target     loaded active active  ceph target allowing to start/stop all ceph-mgr@.service instances at once
ceph-mon.target     loaded active active  ceph target allowing to start/stop all ceph-mon@.service instances at once
ceph-osd.target     loaded active active  ceph target allowing to start/stop all ceph-osd@.service instances at once
ceph-radosgw.target loaded active active  ceph target allowing to start/stop all ceph-radosgw@.service instances at once
ceph.target         loaded active active  ceph target allowing to start/stop all ceph*@.service instances at once

 

 

On Thu, Jul 20, 2017 at 10:23 AM Roger Brown <rogerpbrown@xxxxxxxxx> wrote:

I think I need help with some OSD trouble. OSD daemons on two hosts started flapping. At length, I rebooted host osd1 (osd.3), but the OSD daemon still fails to start. Upon closer inspection, ceph-disk@dev-sdb2.service is failing to start due to "Error: /dev/sdb2 is not a block device".

 

This is the command I see it failing to run: 

 

roger@osd1:~$ sudo /usr/sbin/ceph-disk --verbose activate-block /dev/sdb2
Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 9, in <module>
    load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5731, in run
    main(sys.argv[1:])
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5682, in main
    args.func(args)
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5438, in <lambda>
    func=lambda args: main_activate_space(name, args),
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4160, in main_activate_space
    osd_uuid = get_space_osd_uuid(name, dev)
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4115, in get_space_osd_uuid
    raise Error('%s is not a block device' % path)
ceph_disk.main.Error: Error: /dev/sdb2 is not a block device
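
For what it's worth, a quick sanity check before re-running the activation is to ask the shell what /dev/sdb2 actually is, since a block-device test is exactly what the traceback shows failing (plain shell, nothing ceph-specific):

# [ -b ... ] succeeds only if the path exists and is a block special file.
if [ -b /dev/sdb2 ]; then
    echo "/dev/sdb2 is a block device"
else
    ls -l /dev/sdb2    # show what it actually is (missing, regular file, ...)
fi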

 

osd1 environment:
$ ceph -v
ceph version 12.1.1 (f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc)
$ uname -r
4.4.0-83-generic
$ lsb_release -sc
xenial

 

Please advise.

 




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

 

