Re: Unable to activate OSD's

The only clue I have run across so far is that the OSD daemons ceph-deploy attempts to create on the failing OSD server (osd3) reuse two of the osd ids that were just created on the last OSD server deployed (osd2). From the osd tree listing, osd1 has osd.0, osd.1, osd.2 and osd.3, and the next server, osd2, has the next four in the correct order: osd.4, osd.5, osd.6 and osd.7. The failing OSD server should have started with osd.8 through osd.11; instead it is reusing osd.5 and osd.6. These are also the only log files in /var/log/ceph on the osd3 server, and they contain only the following entry repeated over and over again (a quick check of that data dir is sketched after the log excerpt):

2018-02-07 08:09:33.077286 7f264e6a8800  0 set uid:gid to 167:167 (ceph:ceph)
2018-02-07 08:09:33.077321 7f264e6a8800  0 ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe), process ceph-osd, pid 4923
2018-02-07 08:09:33.077572 7f264e6a8800 -1  ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-5: (2) No such file or directory
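
To pin this down I would start with roughly the following on osd3 and on the admin node (a sketch, not verbatim output; osd.5 is just the id taken from the error above):

# on osd3: does the data dir the daemon complains about exist, and is anything mounted there?
ls -l /var/lib/ceph/osd/ceph-5
mount | grep ceph-5

# on the admin node: which host does the cluster think registered osd.5? (metadata includes the hostname)
sudo ceph osd find 5
sudo ceph osd metadata 5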


The output of ceph-disk list on osd3 and of ceph osd tree follows:

[osd3][DEBUG ] connected to host: osd3
[osd3][DEBUG ] detect platform information from remote host
[osd3][DEBUG ] detect machine type
[osd3][DEBUG ] find the location of an executable
[osd3][INFO  ] Running command: /usr/sbin/ceph-disk list
[osd3][INFO  ] ----------------------------------------
[osd3][INFO  ] ceph-5
[osd3][INFO  ] ----------------------------------------
[osd3][INFO  ] Path           /var/lib/ceph/osd/ceph-5
[osd3][INFO  ] ID             5
[osd3][INFO  ] Name           osd.5
[osd3][INFO  ] Status         up
[osd3][INFO  ] Reweight       1.0
[osd3][INFO  ] ----------------------------------------
[osd3][INFO  ] ----------------------------------------
[osd3][INFO  ] ceph-6
[osd3][INFO  ] ----------------------------------------
[osd3][INFO  ] Path           /var/lib/ceph/osd/ceph-6
[osd3][INFO  ] ID             6
[osd3][INFO  ] Name           osd.6
[osd3][INFO  ] Status         up
[osd3][INFO  ] Reweight       1.0
[osd3][INFO  ] ----------------------------------------


[cephuser@groot cephcluster]$ sudo ceph osd tree
ID WEIGHT  TYPE NAME     UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 1.06311 root default                                    
-2 0.53156     host osd1                                   
 0 0.13289         osd.0      up  1.00000          1.00000
 1 0.13289         osd.1      up  1.00000          1.00000
 2 0.13289         osd.2      up  1.00000          1.00000
 3 0.13289         osd.3      up  1.00000          1.00000
-3 0.53156     host osd2                                   
 4 0.13289         osd.4      up  1.00000          1.00000
 5 0.13289         osd.5      up  1.00000          1.00000
 6 0.13289         osd.6      up  1.00000          1.00000
 7 0.13289         osd.7      up  1.00000          1.00000
[cephuser@groot cephcluster]$


From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Андрей <andrey_aha@xxxxxxx>
Sent: Thursday, February 8, 2018 6:40:16 AM
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re: Unable to activate OSD's
 

I have the same problem.
Configuration:
4 hardware servers running Debian GNU/Linux 9.3 (stretch)
Ceph Luminous 12.2.2

I have since installed ceph version 10.2.10 on these servers, and the OSDs activate fine.


Wednesday, February 7, 2018, 19:54 +03:00, from "Cranage, Steve" <scranage@xxxxxxxxxxxxxxxxxxxx>:

Greetings ceph-users. I have been trying to build a test cluster in a KVM environment - something I have done successfully before, but this time I'm running into an issue I can't seem to get past. My Internet searches have shown instances of this from other users that involved either ownership problems with the OSD devices or partition UIDs needing to be set. Neither of these problems seems to be in play here.


The cluster is on CentOS 7, running Ceph 10.2.10. I have configured one mon and 3 OSD servers with 4 disks each, and each disk is set to journal on a separate partition of an SSD, one SSD per VM. I have built this VM environment several times now, and lately I always hit the same issue on at least one of my VM OSD servers, and I cannot get any hint of where the problem lies from the sparse information printed to the console during the failure.
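
For context, the deployment flow I am following is the usual jewel-era ceph-deploy sequence, roughly the following (mon1 is a stand-in name here for my monitor VM; osd1-osd3 are the OSD VMs):

ceph-deploy new mon1
ceph-deploy install mon1 osd1 osd2 osd3
ceph-deploy mon create-initial
ceph-deploy admin mon1 osd1 osd2 osd3
# one prepare/activate pair per data partition, each with its own SSD journal partition
ceph-deploy osd prepare osd1:/dev/vdb1:/dev/vdf1
ceph-deploy osd activate osd1:/dev/vdb1:/dev/vdf1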


In addition to setting partition ownership to ceph:ceph and the partition type UIDs to one of the values "set_data_partition" says it expects, I also zeroed out the entire contents of both drives and re-partitioned, but I still get the same results. At present the problem only occurs on one virtual server; the other 8 drives, split between the other 2 VM OSD servers, had no issue with prepare or activate. I see no difference between this server's configuration or drive layout and that of the other two that run fine.
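
For reference, the kind of commands I mean are along these lines (a sketch, not my exact history; the GUIDs are the standard ceph-disk partition type codes for data and journal):

# give the ceph user ownership of the partitions
sudo chown ceph:ceph /dev/vdb1 /dev/vdf1
# set the GPT partition type codes that ceph-disk / its udev rules expect
sudo sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d /dev/vdb    # OSD data
sudo sgdisk --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/vdf    # journal
# wipe and re-partition (I zeroed the whole devices; a short dd is shown here)
sudo dd if=/dev/zero of=/dev/vdb bs=1M count=100
sudo dd if=/dev/zero of=/dev/vdf bs=1M count=100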


Hopefully someone can at least point me to some more fruitful log information; "Failed to activate" isn't very helpful by itself. There is nothing in the messages log other than clean mount/unmount entries for the OSD data device being processed (in this case /dev/vdb1). BTW, I have also tried to repeat the same process without a separate journal device (just using prepare/activate osd3:/dev/vdb1) and I got the same "Failed to activate" result.



[cephuser@groot cephcluster]$ ceph-deploy osd prepare osd3:/dev/vdb1:/dev/vdf1
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephuser/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.39): /bin/ceph-deploy osd prepare osd3:/dev/vdb1:/dev/vdf1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  block_db                      : None
[ceph_deploy.cli][INFO  ]  disk                          : [('osd3', '/dev/vdb1', '/dev/vdf1')]
[ceph_deploy.cli][INFO  ]  dmcrypt                       : False
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  bluestore                     : None
[ceph_deploy.cli][INFO  ]  block_wal                     : None
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : prepare
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir               : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x2a7bdd0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  fs_type                       : xfs
[ceph_deploy.cli][INFO  ]  filestore                     : None
[ceph_deploy.cli][INFO  ]  func                          : <function osd at 0x2a6f1b8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  zap_disk                      : False
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks osd3:/dev/vdb1:/dev/vdf1
[osd3][DEBUG ] connection detected need for sudo
[osd3][DEBUG ] connected to host: osd3
[osd3][DEBUG ] detect platform information from remote host
[osd3][DEBUG ] detect machine type
[osd3][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to osd3
[osd3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.osd][DEBUG ] Preparing host osd3 disk /dev/vdb1 journal /dev/vdf1 activate False
[osd3][DEBUG ] find the location of an executable
[osd3][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /dev/vdb1 /dev/vdf1
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is /sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is /sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is /sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdf1 uuid path is /sys/dev/block/252:81/dm/uuid
[osd3][WARNIN] prepare_device: Journal /dev/vdf1 is a partition
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdf1 uuid path is /sys/dev/block/252:81/dm/uuid
[osd3][WARNIN] prepare_device: OSD will not be hot-swappable if journal is not the same device as the osd data
[osd3][WARNIN] command: Running command: /sbin/blkid -o udev -p /dev/vdf1
[osd3][WARNIN] prepare_device: Journal /dev/vdf1 was not prepared with ceph-disk. Symlinking directly.
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is /sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] set_data_partition: OSD data device /dev/vdb1 is a partition
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is /sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] command: Running command: /sbin/blkid -o udev -p /dev/vdb1
[osd3][WARNIN] populate_data_path_device: Creating xfs fs on /dev/vdb1
[osd3][WARNIN] command_check_call: Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/vdb1
[osd3][DEBUG ] meta-data="" isize=2048   agcount=4, agsize=8920960 blks
[osd3][DEBUG ]          =                       sectsz=512   attr=2, projid32bit=1
[osd3][DEBUG ]          =                       crc=1        finobt=0, sparse=0
[osd3][DEBUG ] data     =                       bsize=4096   blocks=35683840, imaxpct=25
[osd3][DEBUG ]          =                       sunit=0      swidth=0 blks
[osd3][DEBUG ] naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
[osd3][DEBUG ] log      =internal log           bsize=4096   blocks=17423, version=2
[osd3][DEBUG ]          =                       sectsz=512   sunit=0 blks, lazy-count=1
[osd3][DEBUG ] realtime =none                   extsz=4096   blocks=0, rtextents=0
[osd3][WARNIN] mount: Mounting /dev/vdb1 on /var/lib/ceph/tmp/mnt.EWuVuW with options noatime,inode64
[osd3][WARNIN] command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/vdb1 /var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] command: Running command: /sbin/restorecon /var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] populate_data_path: Preparing osd data dir /var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] command: Running command: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.EWuVuW/ceph_fsid.7378.tmp
[osd3][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.EWuVuW/ceph_fsid.7378.tmp
[osd3][WARNIN] command: Running command: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.EWuVuW/fsid.7378.tmp
[osd3][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.EWuVuW/fsid.7378.tmp
[osd3][WARNIN] command: Running command: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.EWuVuW/magic.7378.tmp
[osd3][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.EWuVuW/magic.7378.tmp
[osd3][WARNIN] command: Running command: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.EWuVuW/journal_uuid.7378.tmp
[osd3][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.EWuVuW/journal_uuid.7378.tmp
[osd3][WARNIN] adjust_symlink: Creating symlink /var/lib/ceph/tmp/mnt.EWuVuW/journal -> /dev/vdf1
[osd3][WARNIN] command: Running command: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] unmount: Unmounting /var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is /sys/dev/block/252:17/dm/uuid
[osd3][INFO  ] checking OSD status...
[osd3][DEBUG ] find the location of an executable
[osd3][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host osd3 is now ready for osd use.


[cephuser@groot cephcluster]$ ceph-deploy osd activate osd3:/dev/vdb1:/dev/vdf1
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephuser/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.39): /bin/ceph-deploy osd activate osd3:/dev/vdb1:/dev/vdf1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : activate
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x20f9dd0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function osd at 0x20ed1b8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  disk                          : [('osd3', '/dev/vdb1', '/dev/vdf1')]
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks osd3:/dev/vdb1:/dev/vdf1
[osd3][DEBUG ] connection detected need for sudo
[osd3][DEBUG ] connected to host: osd3
[osd3][DEBUG ] detect platform information from remote host
[osd3][DEBUG ] detect machine type
[osd3][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.osd][DEBUG ] activating host osd3 disk /dev/vdb1
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[osd3][DEBUG ] find the location of an executable
[osd3][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/vdb1
[osd3][WARNIN] main_activate: path = /dev/vdb1
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is /sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] command: Running command: /sbin/blkid -o udev -p /dev/vdb1
[osd3][WARNIN] command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/vdb1
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[osd3][WARNIN] mount: Mounting /dev/vdb1 on /var/lib/ceph/tmp/mnt.G7uifc with options noatime,inode64
[osd3][WARNIN] command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/vdb1 /var/lib/ceph/tmp/mnt.G7uifc
[osd3][WARNIN] command: Running command: /sbin/restorecon /var/lib/ceph/tmp/mnt.G7uifc
[osd3][WARNIN] activate: Cluster uuid is 83d61520-5a38-4f50-9b54-bef4f6bef08c
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[osd3][WARNIN] activate: Cluster name is ceph
[osd3][WARNIN] activate: OSD uuid is 4627c861-71b7-485e-a402-30bff54a963c
[osd3][WARNIN] allocate_osd_id: Allocating OSD id...
[osd3][WARNIN] command: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise 4627c861-71b7-485e-a402-30bff54a963c
[osd3][WARNIN] mount_activate: Failed to activate
[osd3][WARNIN] unmount: Unmounting /var/lib/ceph/tmp/mnt.G7uifc
[osd3][WARNIN] command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.G7uifc
[osd3][WARNIN] Traceback (most recent call last):
[osd3][WARNIN]   File "/usr/sbin/ceph-disk", line 9, in <module>
[osd3][WARNIN]     load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5371, in run
[osd3][WARNIN]     main(sys.argv[1:])
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5322, in main
[osd3][WARNIN]     args.func(args)
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3445, in main_activate
[osd3][WARNIN]     reactivate=args.reactivate,
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3202, in mount_activate
[osd3][WARNIN]     (osd_id, cluster) = activate(path, activate_key_template, init)
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3365, in activate
[osd3][WARNIN]     keyring=keyring,
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1013, in allocate_osd_id
[osd3][WARNIN]     raise Error('ceph osd create failed', e, e.output)
[osd3][WARNIN] ceph_disk.main.Error: Error: ceph osd create failed: Command '/usr/bin/ceph' returned non-zero exit status 1: 2018-02-07 09:38:40.104098 7fa479cf2700  0 librados: client.bootstrap-osd authentication error (1) Operation not permitted
[osd3][WARNIN] Error connecting to cluster: PermissionError
[osd3][WARNIN]
[osd3][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/vdb1

[cephuser@groot cephcluster]$
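
In case it helps: the line that looks like the actual failure is the client.bootstrap-osd authentication error ("Operation not permitted") just before the traceback, i.e. the "ceph osd create" that allocates the new osd id is being rejected by the monitor. The checks I plan to run next are roughly these (a sketch, assuming the default keyring locations; <mon-host> is a placeholder):

# on osd3: the key ceph-disk uses during activate
sudo cat /var/lib/ceph/bootstrap-osd/ceph.keyring
# on the admin/mon node: the key the cluster actually has for that client
sudo ceph auth get client.bootstrap-osd
# if they differ, re-fetch the bootstrap keys and redistribute them, e.g.
ceph-deploy gatherkeys <mon-host>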




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
