Re: ceph-disk from jewel has issues on redhat 7

Hi,

Is there a tracker for this? We just hit the same problem on 10.0.5.

Cheers, Dan

# rpm -q ceph
ceph-10.0.5-0.el7.x86_64

# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)

# ceph-disk -v prepare /dev/sdc
DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is /sys/dev/block/8:32/dm/uuid
DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is /sys/dev/block/8:32/dm/uuid
DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is /sys/dev/block/8:32/dm/uuid
INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph
--show-config-value=fsid
INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_mkfs_type
INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_mkfs_options_xfs
INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_fs_mkfs_options_xfs
INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_mount_options_xfs
INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph
--show-config-value=osd_journal_size
INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_cryptsetup_parameters
INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_dmcrypt_key_size
INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_dmcrypt_type
DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is /sys/dev/block/8:32/dm/uuid
INFO:ceph-disk:Will colocate journal with data on /dev/sdc
DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is /sys/dev/block/8:32/dm/uuid
DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is /sys/dev/block/8:32/dm/uuid
DEBUG:ceph-disk:Creating journal partition num 2 size 20480 on /dev/sdc
INFO:ceph-disk:Running command: /usr/sbin/sgdisk --new=2:0:20480M
--change-name=2:ceph journal
--partition-guid=2:aa23e07d-e6b3-4261-a236-c0565971d88d
--typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt --
/dev/sdc
The operation has completed successfully.
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/sdc
INFO:ceph-disk:Running command: /usr/bin/udevadm settle
INFO:ceph-disk:Running command: /usr/sbin/partprobe /dev/sdc
Error: Error informing the kernel about modifications to partition
/dev/sdc2 -- Device or resource busy.  This means Linux won't know
about any changes you made to /dev/sdc2 until you reboot -- so you
shouldn't mount it or use it in any way before rebooting.
Error: Failed to add partition 2 (Device or resource busy)
Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 3528, in <module>
    main(sys.argv[1:])
  File "/usr/sbin/ceph-disk", line 3482, in main
    args.func(args)
  File "/usr/sbin/ceph-disk", line 1817, in main_prepare
    luks=luks
  File "/usr/sbin/ceph-disk", line 1447, in prepare_journal
    return prepare_journal_dev(data, journal, journal_size,
journal_uuid, journal_dm_keypath, cryptsetup_parameters, luks)
  File "/usr/sbin/ceph-disk", line 1401, in prepare_journal_dev
    raise Error(e)
__main__.Error: Error: Command '['/usr/sbin/partprobe', '/dev/sdc']'
returned non-zero exit status 1
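
For what it is worth, sgdisk reports success ("The operation has completed
successfully."), so the partition appears to exist on disk; it looks like only
the kernel/udev view of /dev/sdc is stale when partprobe fails. A rough manual
recovery to try before re-running ceph-disk might look like the following
(just a sketch; this ceph-disk calls partprobe rather than partx, so the
partx calls are a manual workaround, and the timeout value is arbitrary):

# partx -u /dev/sdc             # refresh the kernel's view of existing partitions
# partx -a /dev/sdc             # add any partitions the kernel does not know about yet
# udevadm settle --timeout=600  # wait for udev to finish creating device nodes
# cat /proc/partitions          # check whether sdc2 is now listed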

On Tue, Mar 15, 2016 at 8:38 PM, Vasu Kulkarni <vakulkar@xxxxxxxxxx> wrote:
> Thanks, those steps should be enough to test it out. I hope you got the
> latest ceph-deploy, either from pip or through github.
>
> On Tue, Mar 15, 2016 at 12:29 PM, Stephen Lord <Steve.Lord@xxxxxxxxxxx>
> wrote:
>>
>> I would have to nuke my cluster right now, and I do not have a spare one.
>>
>> The procedure though is literally this, given a 3-node redhat 7.2 cluster:
>> ceph00, ceph01 and ceph02
>>
>> ceph-deploy install --testing ceph00 ceph01 ceph02
>> ceph-deploy new ceph00 ceph01 ceph02
>>
>> ceph-deploy mon create  ceph00 ceph01 ceph02
>> ceph-deploy gatherkeys  ceph00
>>
>> ceph-deploy osd create ceph00:sdb:/dev/sdi
>> ceph-deploy osd create ceph00:sdc:/dev/sdi
>>
>> All devices have their partition tables wiped before this. They are all
>> just SATA devices, no special devices in the way.
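>>
>> For anyone trying to reproduce this, the wipe itself is nothing ceph
>> specific; something along these lines per device is enough (sgdisk is just
>> one possible tool for it, not necessarily what matters here):
>>
>> sgdisk --zap-all /dev/sdb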
>>
>> sdi is an SSD and it is being carved up for journals. The first osd create
>> works; the second one gets stuck in a loop in the update_partition call in
>> ceph_disk for all 5 iterations before it gives up. When I look in
>> /sys/block/sdi the partition for the first osd is visible, but the one for
>> the second is not. However, /proc/partitions shows the correct thing. So I
>> suspect partprobe is not kicking udev into doing the right thing when the
>> second partition is added.
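>>
>> Concretely, the checks I mean are along these lines (using the second
>> journal partition on sdi as the example):
>>
>> ls /sys/block/sdi/     # sdi1 is there, sdi2 is missing
>> cat /proc/partitions   # both sdi1 and sdi2 are listed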
>>
>> If I do not use the separate journal device then it usually works, but
>> occasionally I see a single retry in that same loop.
>>
>> There is code in ceph_deploy which uses partprobe or partx depending on
>> which distro it detects; that is how I worked out what to change here.
>>
>> If I have to tear things down again I will reproduce and post here.
>>
>> Steve
>>
>> > On Mar 15, 2016, at 2:12 PM, Vasu Kulkarni <vakulkar@xxxxxxxxxx> wrote:
>> >
>> > Do you mind posting the full failed logs somewhere like fpaste.org along
>> > with some OS version details?
>> > There are some known issues on RHEL. If you use 'osd prepare' and 'osd
>> > activate' (specifying just the journal partition here) it might work better.
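>> >
>> > For example, something along these lines (placeholder hostname and devices;
>> > the partition numbers are just a guess at what prepare would create):
>> >
>> > ceph-deploy osd prepare node1:sdb:/dev/sdc
>> > ceph-deploy osd activate node1:/dev/sdb1:/dev/sdc1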
>> >
>> > On Tue, Mar 15, 2016 at 12:05 PM, Stephen Lord <Steve.Lord@xxxxxxxxxxx>
>> > wrote:
>> > Not multipath, if you mean using the multipath driver; just trying to
>> > set up OSDs which use a data disk and a journal SSD. If I run just a
>> > disk-based OSD and only specify one device to ceph-deploy then it usually
>> > works, although it sometimes has to retry. In the case where I am using it
>> > to carve an SSD into several partitions for journals it fails on the
>> > second one.
>> >
>> > Steve
>> >
>>
>>
>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


