On Sun, Jul 17, 2016 at 10:01 PM, Will Dennis <willard.dennis@xxxxxxxxx> wrote:
>
> On Jul 17, 2016, at 7:05 AM, Ruben Kerkhof <ruben@xxxxxxxxxxxxxxxx> wrote:
>
> First, there's an issue with the version of parted in CentOS 7.2:
> https://bugzilla.redhat.com/1339705
>
> Saw this sort of thing:
>
> [ceph2][WARNIN] update_partition: Calling partprobe on created device /dev/sde
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
> [ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sde
> [ceph2][WARNIN] update_partition: partprobe /dev/sde failed : Error: Error informing the kernel about modifications to partition /dev/sde1 -- Device or resource busy. This means Linux won't know about any changes you made to /dev/sde1 until you reboot -- so you shouldn't mount it or use it in any way before rebooting.
> [ceph2][WARNIN] Error: Failed to add partition 1 (Device or resource busy)
> [ceph2][WARNIN] (ignored, waiting 60s)
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
> [ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sde
> [ceph2][WARNIN] update_partition: partprobe /dev/sde failed : Error: Error informing the kernel about modifications to partition /dev/sde1 -- Device or resource busy. This means Linux won't know about any changes you made to /dev/sde1 until you reboot -- so you shouldn't mount it or use it in any way before rebooting.
> [ceph2][WARNIN] Error: Failed to add partition 1 (Device or resource busy)
> [ceph2][WARNIN] (ignored, waiting 60s)
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
> [ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sde
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
> [ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
> [ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
> [ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde1 uuid path is /sys/dev/block/8:65/dm/uuid
> [ceph2][WARNIN] populate_data_path_device: Creating xfs fs on /dev/sde1
>
> Is this because of the aforementioned bug? In each case it seemed to succeed after a few retries.

It is, yes. Most of the time it succeeds after retrying, but I've seen it fail too.
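If a run ever gives up entirely, you can try to get the kernel and udev back in sync by hand before re-running prepare. A rough sketch with the standard partition tools (partx is my addition here, not something ceph-disk ran in your log), using /dev/sde from the log above:

# see which parted build you have; the bug report above is against the CentOS 7.2 package
rpm -q parted

# ask the kernel to re-read the partition table, then wait for udev to finish
partprobe /dev/sde
udevadm settle --timeout=600

# if partprobe keeps hitting 'Device or resource busy', partx (from util-linux)
# can sometimes update the kernel's view of the partitions without a full re-read
partx -u /dev/sde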
> Secondly, the disks are now activated by udev. Instead of using activate, use prepare and udev handles the rest.
>
> I saw this sort of thing after each disk prepare:
>
> [ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdc uuid path is /sys/dev/block/8:32/dm/uuid
> [ceph2][WARNIN] command_check_call: Running command: /sbin/sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/sdc
> [ceph2][DEBUG ] Warning: The kernel is still using the old partition table.
> [ceph2][DEBUG ] The new table will be used at the next reboot.
> [ceph2][DEBUG ] The operation has completed successfully.
> [ceph2][WARNIN] update_partition: Calling partprobe on prepared device /dev/sdc
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
> [ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sdc
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm trigger --action=add --sysname-match sdc1
> [ceph2][INFO ] checking OSD status...
> [ceph2][DEBUG ] find the location of an executable
> [ceph2][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
> [ceph_deploy.osd][DEBUG ] Host ceph2 is now ready for osd use.
>
> Is the ‘udevadm’ stuff I see there what you are talking about? How may I verify that the disks are activated & ready for use?

Yes, that's it. You should see ceph-osd processes running, and the OSDs should be marked 'up' when you run 'ceph osd tree'.

> Third, this doesn't work well if you're also using LVM on your host, since for some reason this causes udev to not send the necessary add/change events.
>
> Not using LVM on these hosts, but good to know.

Just thought of a fourth issue: please make sure your disks are absolutely empty! I reused disks that previously held zfs, and zfs leaves metadata behind at the end of the disk. This confuses blkid greatly (and me too); ceph-disk prepare --zap is not enough to resolve it. I've stuck the following in the kickstart file I use to prepare my OSD servers:

%pre
#!/bin/bash
for disk in $(ls -1 /dev/sd* | awk '/[a-z]$/ {print}'); do
    test -b "$disk" || continue
    size_in_bytes=$(blockdev --getsize64 ${disk})
    offset=$((size_in_bytes - 8 * 1024 * 1024))
    echo "Wiping ${disk}"
    # wipe the first 8 MiB (partition table and most filesystem superblocks)
    dd if=/dev/zero of=${disk} bs=1M count=8 status=none
    # wipe the last 8 MiB (zfs labels and the GPT backup live at the end)
    dd if=/dev/zero of=${disk} bs=1M count=8 seek=${offset} oflag=seek_bytes status=none
done
%end

Kind regards,

Ruben
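P.S. An alternative to the raw dd wipes, if you prefer: wipefs (util-linux) locates and erases known signatures via the same probing library blkid uses, so it should also catch the zfs labels at the end of the disk. A sketch of the idea, assuming the same whole-disk naming as above and that gdisk is available in your installer environment:

for disk in $(ls -1 /dev/sd* | awk '/[a-z]$/ {print}'); do
    test -b "$disk" || continue
    echo "Wiping signatures on ${disk}"
    # erase every filesystem/raid/partition-table signature libblkid can find
    wipefs --all "${disk}"
    # belt and braces: destroy both the primary GPT and its backup at the end of the disk
    sgdisk --zap-all "${disk}"
done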