On Wed, Nov 15, 2017 at 8:31 AM, Wei Jin <wjin.cn@xxxxxxxxx> wrote:
> I tried to do purge/purgedata and then redo the deploy command a
> few times, and it still fails to start the osd.
> And there is no error log, anyone know what's the problem?

Seems like this is OSD 0, right? Have you checked for startup errors in
/var/log/ceph/, or checked the output of the daemon with systemctl?

If nothing is working still, maybe try running the OSD in the foreground
with (assuming OSD 0):

    /usr/bin/ceph-osd --debug_osd 20 -d -f --cluster ceph --id 0 --setuser ceph --setgroup ceph

Behind the scenes, ceph-disk is getting these devices ready and
associated with the cluster as OSD 0. If you've tried this many times
already, I suspect the same OSD id is being reused or the drives are
polluted.

Seems like you are using filestore as well, so sdb1 will probably be
your data, mounted at /var/lib/ceph/osd/ceph-0, and sdb2 your journal,
linked at /var/lib/ceph/osd/ceph-0/journal.

Make sure those are mounted and linked properly.

> BTW, my os is Debian with 4.4 kernel.
> Thanks.
>
>
> On Wed, Nov 15, 2017 at 8:24 PM, Wei Jin <wjin.cn@xxxxxxxxx> wrote:
>> Hi, List,
>>
>> My machine has 12 SSD disks, and I use ceph-deploy to deploy them. But for
>> some machines/disks, it failed to start the osd.
>> I tried many times, some succeeded but others failed. But there is no error
>> info.
>> Following is ceph-deploy log for one disk:
>>
>>
>> root@n10-075-012:~# ceph-deploy osd create --zap-disk n10-075-094:sdb:sdb
>> [ceph_deploy.conf][DEBUG ] found configuration file at:
>> /root/.cephdeploy.conf
>> [ceph_deploy.cli][INFO ] Invoked (1.5.39): /usr/bin/ceph-deploy osd create
>> --zap-disk n10-075-094:sdb:sdb
>> [ceph_deploy.cli][INFO ] ceph-deploy options:
>> [ceph_deploy.cli][INFO ] username : None
>> [ceph_deploy.cli][INFO ] block_db : None
>> [ceph_deploy.cli][INFO ] disk : [('n10-075-094',
>> '/dev/sdb', '/dev/sdb')]
>> [ceph_deploy.cli][INFO ] dmcrypt : False
>> [ceph_deploy.cli][INFO ] verbose : False
>> [ceph_deploy.cli][INFO ] bluestore : None
>> [ceph_deploy.cli][INFO ] block_wal : None
>> [ceph_deploy.cli][INFO ] overwrite_conf : False
>> [ceph_deploy.cli][INFO ] subcommand : create
>> [ceph_deploy.cli][INFO ] dmcrypt_key_dir :
>> /etc/ceph/dmcrypt-keys
>> [ceph_deploy.cli][INFO ] quiet : False
>> [ceph_deploy.cli][INFO ] cd_conf :
>> <ceph_deploy.conf.cephdeploy.Conf object at 0x7f566b82a110>
>> [ceph_deploy.cli][INFO ] cluster : ceph
>> [ceph_deploy.cli][INFO ] fs_type : xfs
>> [ceph_deploy.cli][INFO ] filestore : None
>> [ceph_deploy.cli][INFO ] func : <function osd at
>> 0x7f566ae9a938>
>> [ceph_deploy.cli][INFO ] ceph_conf : None
>> [ceph_deploy.cli][INFO ] default_release : False
>> [ceph_deploy.cli][INFO ] zap_disk : True
>> [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks
>> n10-075-094:/dev/sdb:/dev/sdb
>> [n10-075-094][DEBUG ] connected to host: n10-075-094
>> [n10-075-094][DEBUG ] detect platform information from remote host
>> [n10-075-094][DEBUG ] detect machine type
>> [n10-075-094][DEBUG ] find the location of an executable
>> [ceph_deploy.osd][INFO ] Distro info: debian 8.9 jessie
>> [ceph_deploy.osd][DEBUG ] Deploying osd to n10-075-094
>> [n10-075-094][DEBUG ] write cluster configuration to
>> /etc/ceph/{cluster}.conf
>> [ceph_deploy.osd][DEBUG ] Preparing host n10-075-094 disk /dev/sdb journal
>> /dev/sdb activate True
>> [n10-075-094][DEBUG ] find the location of an executable
>> [n10-075-094][INFO ] Running command: /usr/sbin/ceph-disk -v prepare
>> --zap-disk --cluster ceph --fs-type xfs -- /dev/sdb /dev/sdb
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
>> --cluster=ceph --show-config-value=fsid
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
>> --check-allows-journal -i 0 --log-file $run_dir/$cluster-osd-check.log
>> --cluster ceph --setuser ceph --setgroup ceph
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
>> --check-wants-journal -i 0 --log-file $run_dir/$cluster-osd-check.log
>> --cluster ceph --setuser ceph --setgroup ceph
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
>> --check-needs-journal -i 0 --log-file $run_dir/$cluster-osd-check.log
>> --cluster ceph --setuser ceph --setgroup ceph
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
>> --cluster=ceph --show-config-value=osd_journal_size
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb1 uuid path is
>> /sys/dev/block/8:17/dm/uuid
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb2 uuid path is
>> /sys/dev/block/8:18/dm/uuid
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf
>> --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf
>> --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf
>> --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] zap: Zapping partition table on /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/sgdisk
>> --zap-all -- /dev/sdb
>> [n10-075-094][WARNIN] Caution: invalid backup GPT header, but valid main
>> header; regenerating
>> [n10-075-094][WARNIN] backup header from main header.
>> [n10-075-094][WARNIN]
>> [n10-075-094][WARNIN] Warning! Main and backup partition tables differ! Use
>> the 'c' and 'e' options
>> [n10-075-094][WARNIN] on the recovery & transformation menu to examine the
>> two tables.
>> [n10-075-094][WARNIN]
>> [n10-075-094][WARNIN] Warning! One or more CRCs don't match. You should
>> repair the disk!
>> [n10-075-094][WARNIN]
>> [n10-075-094][DEBUG ]
>> ****************************************************************************
>> [n10-075-094][DEBUG ] Caution: Found protective or hybrid MBR and corrupt
>> GPT. Using GPT, but disk
>> [n10-075-094][DEBUG ] verification and recovery are STRONGLY recommended.
>> [n10-075-094][DEBUG ]
>> ****************************************************************************
>> [n10-075-094][DEBUG ] GPT data structures destroyed! You may now partition
>> the disk using fdisk or
>> [n10-075-094][DEBUG ] other utilities.
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/sgdisk
>> --clear --mbrtogpt -- /dev/sdb
>> [n10-075-094][DEBUG ] Creating new GPT entries.
>> [n10-075-094][DEBUG ] The operation has completed successfully.
>> [n10-075-094][WARNIN] update_partition: Calling partprobe on zapped device
>> /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/udevadm
>> settle --timeout=600
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/flock -s /dev/sdb
>> /sbin/partprobe /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/udevadm
>> settle --timeout=600
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] ptype_tobe_for_name: name = journal
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] create_partition: Creating journal partition num 2
>> size 40960 on /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/sgdisk
>> --new=2:0:+40960M --change-name=2:ceph journal
>> --partition-guid=2:b7f01f38-f0d5-45ba-a913-ac7242820aed
>> --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/sdb
>> [n10-075-094][DEBUG ] Setting name!
>> [n10-075-094][DEBUG ] partNum is 1
>> [n10-075-094][DEBUG ] REALLY setting name!
>> [n10-075-094][DEBUG ] The operation has completed successfully.
>> [n10-075-094][WARNIN] update_partition: Calling partprobe on created device
>> /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/udevadm
>> settle --timeout=600
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/flock -s /dev/sdb
>> /sbin/partprobe /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/udevadm
>> settle --timeout=600
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb2 uuid path is
>> /sys/dev/block/8:18/dm/uuid
>> [n10-075-094][WARNIN] prepare_device: Journal is GPT partition
>> /dev/disk/by-partuuid/b7f01f38-f0d5-45ba-a913-ac7242820aed
>> [n10-075-094][WARNIN] prepare_device: Journal is GPT partition
>> /dev/disk/by-partuuid/b7f01f38-f0d5-45ba-a913-ac7242820aed
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] set_data_partition: Creating osd partition on /dev/sdb
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] ptype_tobe_for_name: name = data
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] create_partition: Creating data partition num 1 size 0
>> on /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/sgdisk
>> --largest-new=1 --change-name=1:ceph data
>> --partition-guid=1:6e984e11-1b4b-4741-9080-131f13a73daa
>> --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be --mbrtogpt -- /dev/sdb
>> [n10-075-094][DEBUG ] Setting name!
>> [n10-075-094][DEBUG ] partNum is 0
>> [n10-075-094][DEBUG ] REALLY setting name!
>> [n10-075-094][DEBUG ] The operation has completed successfully.
>> [n10-075-094][WARNIN] update_partition: Calling partprobe on created device
>> /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/udevadm
>> settle --timeout=600
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/flock -s /dev/sdb
>> /sbin/partprobe /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/udevadm
>> settle --timeout=600
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb1 uuid path is
>> /sys/dev/block/8:17/dm/uuid
>> [n10-075-094][WARNIN] populate_data_path_device: Creating xfs fs on
>> /dev/sdb1
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/mkfs -t xfs
>> -f -i size=2048 -- /dev/sdb1
>> [n10-075-094][DEBUG ] meta-data=/dev/sdb1 isize=2048
>> agcount=4, agsize=55984277 blks
>> [n10-075-094][DEBUG ] = sectsz=4096 attr=2,
>> projid32bit=1
>> [n10-075-094][DEBUG ] = crc=0 finobt=0
>> [n10-075-094][DEBUG ] data = bsize=4096
>> blocks=223937105, imaxpct=25
>> [n10-075-094][DEBUG ] = sunit=0 swidth=0
>> blks
>> [n10-075-094][DEBUG ] naming =version 2 bsize=4096
>> ascii-ci=0 ftype=0
>> [n10-075-094][DEBUG ] log =internal log bsize=4096
>> blocks=109344, version=2
>> [n10-075-094][DEBUG ] = sectsz=4096 sunit=1
>> blks, lazy-count=1
>> [n10-075-094][DEBUG ] realtime =none extsz=4096
>> blocks=0, rtextents=0
>> [n10-075-094][WARNIN] mount: Mounting /dev/sdb1 on
>> /var/lib/ceph/tmp/mnt.N8D5Kd with options
>> rw,noexec,noatime,attr2,inode64,logbufs=8,logbsize=256k,noquota
>> [n10-075-094][WARNIN] command_check_call: Running command: /bin/mount -t xfs
>> -o rw,noexec,noatime,attr2,inode64,logbufs=8,logbsize=256k,noquota --
>> /dev/sdb1 /var/lib/ceph/tmp/mnt.N8D5Kd
>> [n10-075-094][WARNIN] populate_data_path: Preparing osd data dir
>> /var/lib/ceph/tmp/mnt.N8D5Kd
>> [n10-075-094][WARNIN] command: Running command: /bin/chown -R ceph:ceph
>> /var/lib/ceph/tmp/mnt.N8D5Kd/ceph_fsid.11531.tmp
>> [n10-075-094][WARNIN] command: Running command: /bin/chown -R ceph:ceph
>> /var/lib/ceph/tmp/mnt.N8D5Kd/fsid.11531.tmp
>> [n10-075-094][WARNIN] command: Running command: /bin/chown -R ceph:ceph
>> /var/lib/ceph/tmp/mnt.N8D5Kd/magic.11531.tmp
>> [n10-075-094][WARNIN] command: Running command: /bin/chown -R ceph:ceph
>> /var/lib/ceph/tmp/mnt.N8D5Kd/journal_uuid.11531.tmp
>> [n10-075-094][WARNIN] adjust_symlink: Creating symlink
>> /var/lib/ceph/tmp/mnt.N8D5Kd/journal ->
>> /dev/disk/by-partuuid/b7f01f38-f0d5-45ba-a913-ac7242820aed
>> [n10-075-094][WARNIN] command: Running command: /bin/chown -R ceph:ceph
>> /var/lib/ceph/tmp/mnt.N8D5Kd
>> [n10-075-094][WARNIN] unmount: Unmounting /var/lib/ceph/tmp/mnt.N8D5Kd
>> [n10-075-094][WARNIN] command_check_call: Running command: /bin/umount --
>> /var/lib/ceph/tmp/mnt.N8D5Kd
>> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
>> /sys/dev/block/8:16/dm/uuid
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/sgdisk
>> --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/sdb
>> [n10-075-094][DEBUG ] Warning: The kernel is still using the old partition
>> table.
>> [n10-075-094][DEBUG ] The new table will be used at the next reboot.
>> [n10-075-094][DEBUG ] The operation has completed successfully.
>> [n10-075-094][WARNIN] update_partition: Calling partprobe on prepared device
>> /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/udevadm
>> settle --timeout=600
>> [n10-075-094][WARNIN] command: Running command: /usr/bin/flock -s /dev/sdb
>> /sbin/partprobe /dev/sdb
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/udevadm
>> settle --timeout=600
>> [n10-075-094][WARNIN] command_check_call: Running command: /sbin/udevadm
>> trigger --action=add --sysname-match sdb1
>> [n10-075-094][INFO ] Running command: systemctl enable ceph.target
>> [n10-075-094][INFO ] checking OSD status...
>> [n10-075-094][DEBUG ] find the location of an executable
>> [n10-075-094][INFO ] Running command: /usr/bin/ceph --cluster=ceph osd stat
>> --format=json
>> [ceph_deploy.osd][DEBUG ] Host n10-075-094 is now ready for osd use.
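
P.S. In case it helps with the "mounted and linked properly" check for OSD 0, here is a rough sketch of how it could be scripted. It only looks at the filesystem and does not talk to the cluster. The function name check_filestore_osd is made up for illustration, and it assumes the usual ceph-disk filestore layout where the journal is a symlink inside the data directory, so adjust the path if your setup differs.

```python
import os


def check_filestore_osd(osd_dir):
    """Return a list of problems found for a filestore OSD data directory.

    Assumption: the data partition is mounted at osd_dir and the journal
    is a symlink at osd_dir/journal, as ceph-disk sets it up.
    """
    if not os.path.isdir(osd_dir):
        return [f"{osd_dir} does not exist"]

    problems = []

    # The data partition (sdb1 in this thread) should be mounted here.
    if not os.path.ismount(osd_dir):
        problems.append(f"{osd_dir} is not a mount point")

    # The journal should be a symlink that resolves to an existing device.
    journal = os.path.join(osd_dir, "journal")
    if not os.path.islink(journal):
        problems.append(f"{journal} is not a symlink")
    elif not os.path.exists(journal):
        target = os.readlink(journal)
        problems.append(f"{journal} points to missing target {target}")

    return problems


if __name__ == "__main__":
    # Path taken from the reply above, assuming OSD id 0.
    found = check_filestore_osd("/var/lib/ceph/osd/ceph-0")
    for line in found or ["data dir is mounted and journal link resolves"]:
        print(line)
```

An empty list only means the data directory is a mount point and the journal link resolves. It says nothing about whether the OSD id or the partitions are left over from an earlier attempt.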