Re: ceph-disk from jewel has issues on redhat 7

Hi,

It's true: partprobe only works intermittently here. I extracted the key
commands to show the problem:

[18:44]# /usr/sbin/sgdisk --new=2:0:20480M --change-name=2:'ceph
journal' --partition-guid=2:aa23e07d-e6b3-4261-a236-c0565971d88d
--typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt --
/dev/sdc
The operation has completed successfully.
[18:44]# partprobe /dev/sdc
Error: Error informing the kernel about modifications to partition
/dev/sdc2 -- Device or resource busy.  This means Linux won't know
about any changes you made to /dev/sdc2 until you reboot -- so you
shouldn't mount it or use it in any way before rebooting.
Error: Failed to add partition 2 (Device or resource busy)
[18:44]# partprobe /dev/sdc
[18:44]# partprobe /dev/sdc
Error: Error informing the kernel about modifications to partition
/dev/sdc2 -- Device or resource busy.  This means Linux won't know
about any changes you made to /dev/sdc2 until you reboot -- so you
shouldn't mount it or use it in any way before rebooting.
Error: Failed to add partition 2 (Device or resource busy)
[18:44]# partprobe /dev/sdc
Error: Error informing the kernel about modifications to partition
/dev/sdc2 -- Device or resource busy.  This means Linux won't know
about any changes you made to /dev/sdc2 until you reboot -- so you
shouldn't mount it or use it in any way before rebooting.
Error: Failed to add partition 2 (Device or resource busy)

But partx works every time:

[18:46]# /usr/sbin/sgdisk --new=2:0:20480M --change-name=2:'ceph
journal' --partition-guid=2:aa23e07d-e6b3-4261-a236-c0565971d88d
--typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt --
/dev/sdd
The operation has completed successfully.
[18:46]# partx -u /dev/sdd
[18:46]# partx -u /dev/sdd
[18:46]# partx -u /dev/sdd
[18:46]#
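
As a stopgap until ceph-disk handles this itself, here is a minimal
sketch of the fallback I have in mind (the script name and structure
are mine; it only uses partprobe, partx and udevadm, which are all
present on these boxes):

    #!/bin/sh
    # refresh_parts.sh: hypothetical helper, not part of ceph-disk.
    # Ask the kernel to re-read the partition table of the given device,
    # falling back to partx when partprobe reports the device busy.
    dev="$1"
    udevadm settle --timeout=600
    if ! partprobe "$dev"; then
        echo "partprobe failed on $dev, falling back to partx -u" >&2
        partx -u "$dev"
    fi
    udevadm settle --timeout=600

Invoked as "refresh_parts.sh /dev/sdc" wherever ceph-disk currently
calls partprobe directly.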

-- Dan

On Thu, Mar 17, 2016 at 6:31 PM, Vasu Kulkarni <vakulkar@xxxxxxxxxx> wrote:
> I can raise a tracker for this issue, since it looks intermittent and
> mostly dependent on specific hardware; or it would be better if you add
> all the hardware/OS details in tracker.ceph.com yourself. Also, from
> your logs it looks like you have a resource-busy issue:
>   Error: Failed to add partition 2 (Device or resource busy)
>
>  From my test run logs on CentOS 7.2, ceph 10.0.5 (
> http://qa-proxy.ceph.com/teuthology/vasu-2016-03-15_15:34:41-selinux-master---basic-mira/62626/teuthology.log
> ):
>
> 2016-03-15T18:49:56.305
> INFO:teuthology.orchestra.run.mira041.stderr:[ceph_deploy.osd][DEBUG ]
> Preparing host mira041 disk /dev/sdb journal None activate True
> 2016-03-15T18:49:56.305
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][DEBUG ] find the
> location of an executable
> 2016-03-15T18:49:56.309
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][INFO  ] Running
> command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs --
> /dev/sdb
> 2016-03-15T18:49:56.546
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] command:
> Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
> 2016-03-15T18:49:56.611
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] command:
> Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster
> ceph
> 2016-03-15T18:49:56.643
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] command:
> Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
> 2016-03-15T18:49:56.708
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] command:
> Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
> 2016-03-15T18:49:56.708
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:56.709
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] set_type:
> Will colocate journal with data on /dev/sdb
> 2016-03-15T18:49:56.709
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] command:
> Running command: /usr/bin/ceph-osd --cluster=ceph
> --show-config-value=osd_journal_size
> 2016-03-15T18:49:56.774
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:56.774
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:56.775
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:56.775
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] command:
> Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup
> osd_mkfs_options_xfs
> 2016-03-15T18:49:56.777
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] command:
> Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup
> osd_fs_mkfs_options_xfs
> 2016-03-15T18:49:56.809
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] command:
> Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup
> osd_mount_options_xfs
> 2016-03-15T18:49:56.841
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] command:
> Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup
> osd_fs_mount_options_xfs
> 2016-03-15T18:49:56.857
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:56.858
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:56.858
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> ptype_tobe_for_name: name = journal
> 2016-03-15T18:49:56.859
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:56.859
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> create_partition: Creating journal partition num 2 size 5120 on /dev/sdb
> 2016-03-15T18:49:56.859
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> command_check_call: Running command: /sbin/sgdisk --new=2:0:+5120M
> --change-name=2:ceph journal
> --partition-guid=2:d4b2fa8d-3f2a-4ce9-a2fe-2a3872d7e198
> --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/sdb
> 2016-03-15T18:49:57.927
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][DEBUG ] The operation
> has completed successfully.
> 2016-03-15T18:49:57.927
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> update_partition: Calling partprobe on created device /dev/sdb
> 2016-03-15T18:49:57.928
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
> 2016-03-15T18:49:58.393
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] command:
> Running command: /sbin/partprobe /dev/sdb
> 2016-03-15T18:49:58.393
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
> 2016-03-15T18:49:59.109
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:59.203
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:59.203
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb2 uuid path is /sys/dev/block/8:18/dm/uuid
> 2016-03-15T18:49:59.204
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> prepare_device: Journal is GPT partition
> /dev/disk/by-partuuid/d4b2fa8d-3f2a-4ce9-a2fe-2a3872d7e198
> 2016-03-15T18:49:59.204
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> prepare_device: Journal is GPT partition
> /dev/disk/by-partuuid/d4b2fa8d-3f2a-4ce9-a2fe-2a3872d7e198
> 2016-03-15T18:49:59.204
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:59.205
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> set_data_partition: Creating osd partition on /dev/sdb
> 2016-03-15T18:49:59.205
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:59.205
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> ptype_tobe_for_name: name = data
> 2016-03-15T18:49:59.206
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING] get_dm_uuid:
> get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
> 2016-03-15T18:49:59.206
> INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
> create_partition: Creating data partition num 1 size 0 on /dev/sdb
>
>
>
> On Thu, Mar 17, 2016 at 8:06 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx>
> wrote:
>>
>> Hi,
>>
>> Is there a tracker for this? We just hit the same problem on 10.0.5.
>>
>> Cheers, Dan
>>
>> # rpm -q ceph
>> ceph-10.0.5-0.el7.x86_64
>>
>> # cat /etc/redhat-release
>> CentOS Linux release 7.2.1511 (Core)
>>
>> # ceph-disk -v prepare /dev/sdc
>> DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is
>> /sys/dev/block/8:32/dm/uuid
>> DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is
>> /sys/dev/block/8:32/dm/uuid
>> DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is
>> /sys/dev/block/8:32/dm/uuid
>> INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph
>> --show-config-value=fsid
>> INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
>> --name=osd. --lookup osd_mkfs_type
>> INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
>> --name=osd. --lookup osd_mkfs_options_xfs
>> INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
>> --name=osd. --lookup osd_fs_mkfs_options_xfs
>> INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
>> --name=osd. --lookup osd_mount_options_xfs
>> INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph
>> --show-config-value=osd_journal_size
>> INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
>> --name=osd. --lookup osd_cryptsetup_parameters
>> INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
>> --name=osd. --lookup osd_dmcrypt_key_size
>> INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph
>> --name=osd. --lookup osd_dmcrypt_type
>> DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is
>> /sys/dev/block/8:32/dm/uuid
>> INFO:ceph-disk:Will colocate journal with data on /dev/sdc
>> DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is
>> /sys/dev/block/8:32/dm/uuid
>> DEBUG:ceph-disk:get_dm_uuid /dev/sdc uuid path is
>> /sys/dev/block/8:32/dm/uuid
>> DEBUG:ceph-disk:Creating journal partition num 2 size 20480 on /dev/sdc
>> INFO:ceph-disk:Running command: /usr/sbin/sgdisk --new=2:0:20480M
>> --change-name=2:ceph journal
>> --partition-guid=2:aa23e07d-e6b3-4261-a236-c0565971d88d
>> --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt --
>> /dev/sdc
>> The operation has completed successfully.
>> DEBUG:ceph-disk:Calling partprobe on prepared device /dev/sdc
>> INFO:ceph-disk:Running command: /usr/bin/udevadm settle
>> INFO:ceph-disk:Running command: /usr/sbin/partprobe /dev/sdc
>> Error: Error informing the kernel about modifications to partition
>> /dev/sdc2 -- Device or resource busy.  This means Linux won't know
>> about any changes you made to /dev/sdc2 until you reboot -- so you
>> shouldn't mount it or use it in any way before rebooting.
>> Error: Failed to add partition 2 (Device or resource busy)
>> Traceback (most recent call last):
>>   File "/usr/sbin/ceph-disk", line 3528, in <module>
>>     main(sys.argv[1:])
>>   File "/usr/sbin/ceph-disk", line 3482, in main
>>     args.func(args)
>>   File "/usr/sbin/ceph-disk", line 1817, in main_prepare
>>     luks=luks
>>   File "/usr/sbin/ceph-disk", line 1447, in prepare_journal
>>     return prepare_journal_dev(data, journal, journal_size,
>> journal_uuid, journal_dm_keypath, cryptsetup_parameters, luks)
>>   File "/usr/sbin/ceph-disk", line 1401, in prepare_journal_dev
>>     raise Error(e)
>> __main__.Error: Error: Command '['/usr/sbin/partprobe', '/dev/sdc']'
>> returned non-zero exit status 1
>>
>> On Tue, Mar 15, 2016 at 8:38 PM, Vasu Kulkarni <vakulkar@xxxxxxxxxx>
>> wrote:
>> > Thanks for the steps; that should be enough to test it out. I hope
>> > you got the latest ceph-deploy, either from pip or through github.
>> >
>> > On Tue, Mar 15, 2016 at 12:29 PM, Stephen Lord <Steve.Lord@xxxxxxxxxxx>
>> > wrote:
>> >>
>> >> I would have to nuke my cluster right now, and I do not have a spare
>> >> one.
>> >>
>> >> The procedure, though, is literally this, given a 3-node redhat 7.2
>> >> cluster with hosts ceph00, ceph01 and ceph02:
>> >>
>> >> ceph-deploy install --testing ceph00 ceph01 ceph02
>> >> ceph-deploy new ceph00 ceph01 ceph02
>> >>
>> >> ceph-deploy mon create  ceph00 ceph01 ceph02
>> >> ceph-deploy gatherkeys  ceph00
>> >>
>> >> ceph-deploy osd create ceph00:sdb:/dev/sdi
>> >> ceph-deploy osd create ceph00:sdc:/dev/sdi
>> >>
>> >> All devices have their partition tables wiped before this. They are all
>> >> just SATA devices, no special devices in the way.
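
A minimal sketch of that wipe step, assuming sgdisk is available (the
exact commands used are not stated in the thread):

    # Hypothetical wipe; destroys all data on the listed devices.
    for dev in /dev/sdb /dev/sdc /dev/sdi; do
        sgdisk --zap-all -- "$dev"                 # clear GPT and MBR data
        dd if=/dev/zero of="$dev" bs=1M count=10   # zero leftover metadata
    done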
>> >>
>> >> sdi is an SSD and it is being carved up for journals. The first osd
>> >> create works; the second one gets stuck in a loop in the
>> >> update_partition call in ceph_disk for the 5 iterations before it
>> >> gives up. When I look in /sys/block/sdi, the partition for the first
>> >> osd is visible but the one for the second is not. However, looking at
>> >> /proc/partitions, it sees the correct thing. So I suspect something
>> >> about partprobe is not kicking udev into doing the right thing when
>> >> the second partition is added.
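
A quick way to check for that mismatch by hand (a sketch; it assumes
the journal device is /dev/sdi, as in Stephen's setup):

    # Partitions the kernel block layer currently reports:
    grep sdi /proc/partitions
    # Partition directories present in sysfs:
    ls -d /sys/block/sdi/sdi* 2>/dev/null
    # If the two views disagree, nudge the kernel and let udev settle:
    partx -u /dev/sdi
    udevadm settle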
>> >>
>> >> If I do not use the separate journal device then it usually works, but
>> >> occasionally I see a single retry in that same loop.
>> >>
>> >> There is code in ceph_deploy which uses partprobe or partx depending
>> >> on which distro it detects; that is how I worked out what to change
>> >> here.
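
The effect of that distro check, expressed as a shell sketch (a
paraphrase from memory; the actual ceph-deploy code is Python and may
differ in detail):

    # RHEL-family systems get partx, everything else gets partprobe.
    if grep -qiE 'red hat|centos|fedora' /etc/os-release; then
        partx -u "$dev"
    else
        partprobe "$dev"
    fi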
>> >>
>> >> If I have to tear things down again I will reproduce and post here.
>> >>
>> >> Steve
>> >>
>> >> > On Mar 15, 2016, at 2:12 PM, Vasu Kulkarni <vakulkar@xxxxxxxxxx>
>> >> > wrote:
>> >> >
>> >> > Do you mind giving the full failed logs somewhere on fpaste.org,
>> >> > along with some OS version details? There are some known issues on
>> >> > RHEL. If you use 'osd prepare' and 'osd activate' (specifying just
>> >> > the journal partition here), it might work better.
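
A hedged example of that two-step flow for Stephen's hosts (the
HOST:DATA:JOURNAL syntax is from the classic ceph-deploy interface;
the partition numbers are assumptions, so check "ceph-deploy osd
--help" before running):

    # Prepare against a pre-made journal partition, then activate it.
    ceph-deploy osd prepare ceph00:sdb:/dev/sdi1
    ceph-deploy osd activate ceph00:/dev/sdb1:/dev/sdi1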
>> >> >
>> >> > On Tue, Mar 15, 2016 at 12:05 PM, Stephen Lord
>> >> > <Steve.Lord@xxxxxxxxxxx>
>> >> > wrote:
>> >> > Not multipath, if you mean using the multipath driver; just trying
>> >> > to set up OSDs which use a data disk and a journal SSD. If I run
>> >> > just a disk-based OSD and only specify one device to ceph-deploy,
>> >> > then it usually works, although it sometimes has to retry. In the
>> >> > case where I am using it to carve an SSD into several partitions
>> >> > for journals, it fails on the second one.
>> >> >
>> >> > Steve
>> >> >
>> >>
>> >>
>> >
>> >
>> >
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


