Hi Frederic,

I think the problem is that you have a package that has this bug: http://tracker.ceph.com/issues/9747

Let me know if using the latest from EPEL (i.e. what was created from https://dl.fedoraproject.org/pub/epel/7/SRPMS/c/ceph-0.80.5-8.el7.src.rpm) solves the problem. I'm learning a lot about udev, CentOS and RHEL in the process ;-)

Cheers

On 11/10/2014 17:31, Loic Dachary wrote:
> Hi,
>
> On RHEL7 it works as expected:
>
> [ubuntu@mira042 ~]$ sudo ceph-disk prepare /dev/sdg
>
> ***************************************************************
> Found invalid GPT and valid MBR; converting MBR to GPT format.
> ***************************************************************
>
> Information: Moved requested sector from 34 to 2048 in
> order to align on 2048-sector boundaries.
> The operation has completed successfully.
> partx: /dev/sdg: error adding partition 2
> Information: Moved requested sector from 10485761 to 10487808 in
> order to align on 2048-sector boundaries.
> The operation has completed successfully.
> meta-data=/dev/sdg1      isize=2048   agcount=4, agsize=60719917 blks
>          =               sectsz=512   attr=2, projid32bit=1
>          =               crc=0
> data     =               bsize=4096   blocks=242879665, imaxpct=25
>          =               sunit=0      swidth=0 blks
> naming   =version 2      bsize=4096   ascii-ci=0 ftype=0
> log      =internal log   bsize=4096   blocks=118593, version=2
>          =               sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none           extsz=4096   blocks=0, rtextents=0
> The operation has completed successfully.
> partx: /dev/sdg: error adding partitions 1-2
> [ubuntu@mira042 ~]$ df
> Filesystem     1K-blocks    Used Available Use% Mounted on
> /dev/sda1      974540108 3412868 971110856   1% /
> devtmpfs         8110996       0   8110996   0% /dev
> tmpfs            8130388       0   8130388   0% /dev/shm
> tmpfs            8130388   58188   8072200   1% /run
> tmpfs            8130388       0   8130388   0% /sys/fs/cgroup
> /dev/sdh1      971044288   34740 971009548   1% /var/lib/ceph/osd/ceph-0
> /dev/sdg1      971044288   33700 971010588   1% /var/lib/ceph/osd/ceph-1
>
> There is an important difference though: RHEL7 does not use https://github.com/ceph/ceph/blob/giant/src/ceph-disk-udev . It should not be necessary for CentOS 7 either, but it looks like it is in use, since the debug output you get comes from it. There must be something wrong in the source package you are using around this point: https://github.com/ceph/ceph/blob/giant/ceph.spec.in#L382
>
> I checked http://ftp.redhat.com/pub/redhat/linux/enterprise/7Server/en/RHOS/SRPMS/ceph-0.80.5-1.el7ost.src.rpm and it is as expected. Could you let me know where you got the package from, and what the version is according to
>
> $ yum info ceph
> Installed Packages
> Name        : ceph
> Arch        : x86_64
> Epoch       : 1
> Version     : 0.80.5
> Release     : 8.el7
> Size        : 37 M
> Repo        : installed
> From repo   : epel
> Summary     : User space components of the Ceph file system
> URL         : http://ceph.com/
> License     : GPL-2.0
> Description : Ceph is a massively scalable, open-source, distributed
>             : storage system that runs on commodity hardware and delivers
>             : object, block and file system storage.
>
> I'm not very familiar with RPMs, but maybe Release : 8.el7 means it is a more recent build of the package: ceph-0.80.5-1.el7ost.src.rpm suggests the Release should be 1.el7 and not 8.el7.
>
> Cheers
>
> On 10/10/2014 15:59, SCHAER Frederic wrote:
>> Hi Loic,
>>
>> Patched, and still not working (sorry)...
>> I'm attaching the prepare output, and also a "real" udev debug output I captured using "udevadm monitor --environment" (udev.log file).
>>
>> I added a "sync" command in ceph-disk-udev (this did not change a thing), and I noticed that the udev script is called 3 times when adding one disk, and that the debug output was captured and then all mixed into one file.
>> This may lead to log misinterpretation (race conditions?)...
>> I changed the logging a bit in order to get one file per call and attached those logs to this mail.
>>
>> File timestamps are as follows:
>> File: '/var/log/udev_ceph.log.out.22706'
>> Change: 2014-10-10 15:48:09.136386306 +0200
>> File: '/var/log/udev_ceph.log.out.22749'
>> Change: 2014-10-10 15:48:11.502425395 +0200
>> File: '/var/log/udev_ceph.log.out.22750'
>> Change: 2014-10-10 15:48:11.606427113 +0200
>>
>> Actually, I can reproduce the UUID=0 thing with this command:
>>
>> [root@ceph1 ~]# /usr/sbin/ceph-disk -v activate-journal /dev/sdc2
>> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdc2
>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> DEBUG:ceph-disk:Journal /dev/sdc2 has OSD UUID 00000000-0000-0000-0000-000000000000
>> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
>> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such file or directory
>> ceph-disk: Cannot discover filesystem type: device /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command '/sbin/blkid' returned non-zero exit status 2
>>
>> Ah - to answer previous mails:
>> - I tried to manually create the GPT partition table to see if things would improve, but this was not the case (I also tried to zero out the start and end of the disks, and also to add random data)
>> - running ceph-disk prepare twice does not work, it's just that once every 20 (?) times it "surprisingly does not fail" on this hardware/OS combination ;)
>>
>> Regards
>>
>> -----Original Message-----
>> From: Loic Dachary [mailto:loic@xxxxxxxxxxx]
>> Sent: Friday, October 10, 2014 14:37
>> To: SCHAER Frederic; ceph-users@xxxxxxxxxxxxxx
>> Subject: Re: ceph-disk prepare : UUID=00000000-0000-0000-0000-000000000000
>>
>> Hi Frederic,
>>
>> To be 100% sure, it would be great if you could manually patch your local ceph-disk script and change 'partprobe', into 'partx', '-a', in https://github.com/ceph/ceph/blob/v0.80.6/src/ceph-disk#L1284
>>
>> ceph-disk zap
>> ceph-disk prepare
>>
>> and hopefully it will show up as it should. It works for me on CentOS 7, but ...
>>
>> Cheers
>>
>> On 10/10/2014 14:33, Loic Dachary wrote:
>>> Hi Frederic,
>>>
>>> It looks like this is just because https://github.com/ceph/ceph/blob/v0.80.6/src/ceph-disk#L1284 should call partx instead of partprobe.
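A minimal sketch of the difference being discussed, assuming a freshly prepared disk; the device name /dev/sdc is illustrative, borrowed from the transcripts elsewhere in this thread:

    # partprobe asks the kernel to re-read the disk's partition table as a whole;
    # if the device is busy this can fail and leave the kernel with its old view
    # of the disk.
    partprobe /dev/sdc

    # partx tells the kernel about partitions one entry at a time, without a full
    # table re-read, which is what the proposed 'partx', '-a' change relies on.
    partx -a /dev/sdc   # add any partitions the kernel does not yet know about
    partx -u /dev/sdc   # or update the entries it already has

Whether /dev/sdc1 and /dev/sdc2 actually show up after this step is what decides whether udev ever sees the ceph data and journal partitions at all.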
>>> The udev debug output makes this quite clear: http://tracker.ceph.com/issues/9721
>>>
>>> I think https://github.com/dachary/ceph/commit/8d914001420e5bfc1e12df2d4882bfe2e1719a5c#diff-788c3cea6213c27f5fdb22f8337096d5R1285 fixes it.
>>>
>>> Cheers
>>>
>>> On 09/10/2014 16:29, SCHAER Frederic wrote:
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Loic Dachary [mailto:loic@xxxxxxxxxxx]
>>>> Sent: Thursday, October 9, 2014 16:20
>>>> To: SCHAER Frederic; ceph-users@xxxxxxxxxxxxxx
>>>> Subject: Re: ceph-disk prepare : UUID=00000000-0000-0000-0000-000000000000
>>>>
>>>>
>>>>
>>>> On 09/10/2014 16:04, SCHAER Frederic wrote:
>>>>> Hi Loic,
>>>>>
>>>>> Back on sdb, as the sde output was from another machine on which I ran partx -u afterwards.
>>>>> To reply to your last question first: I think the SG_IO error comes from the fact that the disks are exported as single-disk RAID0 volumes on a PERC 6/E, which does not support JBOD - this is decommissioned hardware on which I'd like to test and validate that we can use Ceph for our use case...
>>>>>
>>>>> So, back to the UUID.
>>>>> It's funny: I retried and ceph-disk prepare worked this time. I tried on another disk, and it failed.
>>>>> There is a difference in the output from ceph-disk: on the failing disk, I have these extra lines after the disks are prepared:
>>>>>
>>>>> (...)
>>>>> realtime =none           extsz=4096   blocks=0, rtextents=0
>>>>> Warning: The kernel is still using the old partition table.
>>>>> The new table will be used at the next reboot.
>>>>> The operation has completed successfully.
>>>>> partx: /dev/sdc: error adding partitions 1-2
>>>>>
>>>>> I didn't have the warning about the old partition table on the disk that worked.
>>>>> So on this new disk, I have:
>>>>>
>>>>> [root@ceph1 ~]# mount /dev/sdc1 /mnt
>>>>> [root@ceph1 ~]# ll /mnt/
>>>>> total 16
>>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 ceph_fsid
>>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 fsid
>>>>> lrwxrwxrwx 1 root root 58 Oct  9 15:58 journal -> /dev/disk/by-partuuid/5e50bb8b-0b99-455f-af71-10815a32bfbc
>>>>> -rw-r--r-- 1 root root 37 Oct  9 15:58 journal_uuid
>>>>> -rw-r--r-- 1 root root 21 Oct  9 15:58 magic
>>>>>
>>>>> [root@ceph1 ~]# cat /mnt/journal_uuid
>>>>> 5e50bb8b-0b99-455f-af71-10815a32bfbc
>>>>>
>>>>> [root@ceph1 ~]# sgdisk --info=1 /dev/sdc
>>>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>>>> Partition unique GUID: 244973DE-7472-421C-BB25-4B09D3F8D441
>>>>> First sector: 10487808 (at 5.0 GiB)
>>>>> Last sector: 1952448478 (at 931.0 GiB)
>>>>> Partition size: 1941960671 sectors (926.0 GiB)
>>>>> Attribute flags: 0000000000000000
>>>>> Partition name: 'ceph data'
>>>>>
>>>>> [root@ceph1 ~]# sgdisk --info=2 /dev/sdc
>>>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>>>> Partition unique GUID: 5E50BB8B-0B99-455F-AF71-10815A32BFBC
>>>>> First sector: 2048 (at 1024.0 KiB)
>>>>> Last sector: 10485760 (at 5.0 GiB)
>>>>> Partition size: 10483713 sectors (5.0 GiB)
>>>>> Attribute flags: 0000000000000000
>>>>> Partition name: 'ceph journal'
>>>>>
>>>>> Puzzling, isn't it?
>>>>>
>>>>>
>>>>
>>>> Yes :-) Just to be 100% sure: when you try to activate this /dev/sdc, does it show an error and complain that the journal uuid is 0000-000* etc.? If so, could you copy your udev debug output?
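Independent of udev, a quick way to compare what the activation path sees with what is actually on disk; a sketch only, reusing the device names and osd id from the transcripts above:

    # The uuid that ceph-disk activate-journal obtains by reading the journal device:
    ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdc2

    # The unique GUID the GPT actually assigns to the journal partition:
    sgdisk --info=2 /dev/sdc | grep -i 'unique guid'

    # The by-partuuid symlink udev is expected to have created for that GUID:
    ls -l /dev/disk/by-partuuid/

If the first command returns all zeroes while sgdisk reports the real GUID, the kernel's idea of /dev/sdc2 has not caught up with the new partition table yet, which would be consistent with the partx/partprobe diagnosis above.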
>>>>
>>>> Cheers
>>>>
>>>> [>- FS : -<]
>>>>
>>>> No, when I manually activate the disk instead of attempting to go the udev way, it seems to work:
>>>> [root@ceph1 ~]# ceph-disk activate /dev/sdc1
>>>> got monmap epoch 1
>>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> 2014-10-09 16:21:43.286288 7f2be6a027c0 -1 journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected 244973de-7472-421c-bb25-4b09d3f8d441, invalid (someone else's?) journal
>>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> 2014-10-09 16:21:43.301957 7f2be6a027c0 -1 filestore(/var/lib/ceph/tmp/mnt.4lJlzP) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
>>>> 2014-10-09 16:21:43.305941 7f2be6a027c0 -1 created object store /var/lib/ceph/tmp/mnt.4lJlzP journal /var/lib/ceph/tmp/mnt.4lJlzP/journal for osd.47 fsid 70ac4a78-46c0-45e6-8ff9-878b37f50fa1
>>>> 2014-10-09 16:21:43.305992 7f2be6a027c0 -1 auth: error reading file: /var/lib/ceph/tmp/mnt.4lJlzP/keyring: can't open /var/lib/ceph/tmp/mnt.4lJlzP/keyring: (2) No such file or directory
>>>> 2014-10-09 16:21:43.306099 7f2be6a027c0 -1 created new key in keyring /var/lib/ceph/tmp/mnt.4lJlzP/keyring
>>>> added key for osd.47
>>>> === osd.47 ===
>>>> create-or-move updating item name 'osd.47' weight 0.9 at location {host=ceph1,root=default} to crush map
>>>> Starting Ceph osd.47 on ceph1...
>>>> Running as unit run-12392.service.
>>>>
>>>> The osd then appeared in the osd tree...
>>>> I attached the logs to this email (I just added a set -x in the script called by udev, and redirected the output).
>>>>
>>>> Regards
>>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Loïc Dachary, Artisan Logiciel Libre
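For anyone who wants to reproduce the per-invocation udev traces mentioned in this thread, a minimal sketch of the kind of debug preamble described above, one log file per call, named after the PID like the udev_ceph.log.out.* files. The log path and the idea of placing it near the top of ceph-disk-udev are assumptions, not part of the stock script:

    # Hypothetical tracing preamble for ceph-disk-udev: give every udev
    # invocation its own log file so concurrent calls cannot interleave.
    exec > "/var/log/udev_ceph.log.out.$$" 2>&1
    set -x                   # echo every command as it runs
    date '+%F %T.%N'         # record when this invocation started
    echo "called with: $*"   # the arguments passed in by the udev rule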
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com