On 09/10/2014 16:04, SCHAER Frederic wrote:
> Hi Loic,
>
> Back on sdb, as the sde output was from another machine on which I ran partx -u afterwards.
> To reply to your last question first: I think the SG_IO error comes from the fact that the disks are exported as single-disk RAID0 volumes on a PERC 6/E, which does not support JBOD - this is decommissioned hardware on which I'd like to test and validate that we can use ceph for our use case...
>
> So back to the UUID.
> It's funny: I retried and ceph-disk prepare worked this time. I tried on another disk, and it failed.
> There is a difference in the output from ceph-disk: on the failing disk, I have these extra lines after the disks are prepared:
>
> (...)
> realtime =none extsz=4096 blocks=0, rtextents=0
> Warning: The kernel is still using the old partition table.
> The new table will be used at the next reboot.
> The operation has completed successfully.
> partx: /dev/sdc: error adding partitions 1-2
>
> I didn't have the warning about the old partition table on the disk that worked.
> So on this new disk, I have:
>
> [root@ceph1 ~]# mount /dev/sdc1 /mnt
> [root@ceph1 ~]# ll /mnt/
> total 16
> -rw-r--r-- 1 root root 37 Oct 9 15:58 ceph_fsid
> -rw-r--r-- 1 root root 37 Oct 9 15:58 fsid
> lrwxrwxrwx 1 root root 58 Oct 9 15:58 journal -> /dev/disk/by-partuuid/5e50bb8b-0b99-455f-af71-10815a32bfbc
> -rw-r--r-- 1 root root 37 Oct 9 15:58 journal_uuid
> -rw-r--r-- 1 root root 21 Oct 9 15:58 magic
>
> [root@ceph1 ~]# cat /mnt/journal_uuid
> 5e50bb8b-0b99-455f-af71-10815a32bfbc
>
> [root@ceph1 ~]# sgdisk --info=1 /dev/sdc
> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
> Partition unique GUID: 244973DE-7472-421C-BB25-4B09D3F8D441
> First sector: 10487808 (at 5.0 GiB)
> Last sector: 1952448478 (at 931.0 GiB)
> Partition size: 1941960671 sectors (926.0 GiB)
> Attribute flags: 0000000000000000
> Partition name: 'ceph data'
>
> [root@ceph1 ~]# sgdisk --info=2 /dev/sdc
> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
> Partition unique GUID: 5E50BB8B-0B99-455F-AF71-10815A32BFBC
> First sector: 2048 (at 1024.0 KiB)
> Last sector: 10485760 (at 5.0 GiB)
> Partition size: 10483713 sectors (5.0 GiB)
> Attribute flags: 0000000000000000
> Partition name: 'ceph journal'
>
> Puzzling, isn't it ?

Yes :-) Just to be 100% sure: when you try to activate this /dev/sdc, does it show an error and complain that the journal uuid is 0000-000* etc.? If so, could you copy your udev debug output?

Cheers

> -----Original message-----
> From: Loic Dachary [mailto:loic@xxxxxxxxxxx]
> Sent: Thursday, 9 October 2014 15:37
> To: SCHAER Frederic; ceph-users@xxxxxxxxxxxxxx
> Subject: Re: ceph-dis prepare : UUID=00000000-0000-0000-0000-000000000000
>
> What do sgdisk --info=1 /dev/sde and sgdisk --info=2 /dev/sde print ?
>
> It looks like the journal points to an incorrect location (you should see this by mounting /dev/sde1).
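As an aside for anyone comparing these values: the journal_uuid stored on the data partition and the "Partition unique GUID" sgdisk prints above differ only in case, so any comparison has to be case-insensitive. A minimal sketch of such a check (the helper names `normalize` and `uuids_match` are mine for illustration, not part of ceph-disk):

```shell
# Sketch: compare a stored journal_uuid against the GUID sgdisk reports,
# case-insensitively (sgdisk prints upper-case, the journal_uuid file is
# lower-case). `normalize` and `uuids_match` are illustrative helpers,
# not ceph tools.

normalize() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

uuids_match() {
    [ "$(normalize "$1")" = "$(normalize "$2")" ]
}

# On a live system the two inputs would come from:
#   stored=$(cat /mnt/journal_uuid)
#   actual=$(sgdisk --info=2 /dev/sdc | awk '/Partition unique GUID/ {print $4}')
#   uuids_match "$stored" "$actual" && echo "journal uuid consistent"
```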
> Here is what I have on a cluster:
>
> root@bm0015:~# ls -l /var/lib/ceph/osd/ceph-1/
> total 56
> -rw-r--r-- 1 root root 192 Nov 2 2013 activate.monmap
> -rw-r--r-- 1 root root 3 Nov 2 2013 active
> -rw-r--r-- 1 root root 37 Nov 2 2013 ceph_fsid
> drwxr-xr-x 114 root root 8192 Sep 14 11:01 current
> -rw-r--r-- 1 root root 37 Nov 2 2013 fsid
> lrwxrwxrwx 1 root root 58 Nov 2 2013 journal -> /dev/disk/by-partuuid/7e811295-1b45-477d-907a-41c4c90d9687
> -rw-r--r-- 1 root root 37 Nov 2 2013 journal_uuid
> -rw------- 1 root root 56 Nov 2 2013 keyring
> -rw-r--r-- 1 root root 21 Nov 2 2013 magic
> -rw-r--r-- 1 root root 6 Nov 2 2013 ready
> -rw-r--r-- 1 root root 4 Nov 2 2013 store_version
> -rw-r--r-- 1 root root 42 Dec 27 2013 superblock
> -rw-r--r-- 1 root root 0 May 2 14:01 upstart
> -rw-r--r-- 1 root root 2 Nov 2 2013 whoami
> root@bm0015:~# cat /var/lib/ceph/osd/ceph-1/journal_uuid
> 7e811295-1b45-477d-907a-41c4c90d9687
> root@bm0015:~#
>
> I guess in your case the content of journal_uuid is 00000-0000 etc. for some reason.
>
> Do you know where that
>
> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> comes from ?
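A quick consistency check in the spirit of the healthy listing above is to confirm that the `journal` symlink resolves to the by-partuuid name recorded in journal_uuid. This is a sketch; `check_journal_link` is a hypothetical helper name, not a ceph tool:

```shell
# Sketch: verify that an OSD directory's `journal` symlink points at the
# /dev/disk/by-partuuid name recorded in its journal_uuid file, as in the
# healthy bm0015 listing. `check_journal_link` is an illustrative name.

check_journal_link() {
    osd_dir="$1"
    uuid=$(cat "$osd_dir/journal_uuid")        # stored partuuid, lower-case
    target=$(readlink "$osd_dir/journal")      # symlink target path
    # the symlink should end in exactly the partuuid the OSD recorded
    [ "$(basename "$target")" = "$uuid" ]
}

# Example: check_journal_link /var/lib/ceph/osd/ceph-1 && echo "consistent"
```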
>
> On 09/10/2014 15:20, SCHAER Frederic wrote:
>> Hi Loic,
>>
>> With this example disk/machine that I left untouched until now:
>>
>> /dev/sdb :
>> /dev/sdb1 ceph data, prepared, cluster ceph, osd.44, journal /dev/sdb2
>> /dev/sdb2 ceph journal, for /dev/sdb1
>>
>> [root@ceph1 ~]# ll /dev/disk/by-partuuid/
>> total 0
>> lrwxrwxrwx 1 root root 10 Oct 9 15:09 2c27dbda-fbe3-48d6-80fe-b513e1c11702 -> ../../sdb1
>> lrwxrwxrwx 1 root root 10 Oct 9 15:09 d2352e3b-f7f2-40c7-8273-8bfa8ab4206a -> ../../sdb2
>>
>> This is the blkid output:
>>
>> [root@ceph1 ~]# blkid /dev/sdb2
>> [root@ceph1 ~]# blkid /dev/sdb1
>> /dev/sdb1: UUID="c8feaaad-bd83-41a3-a82a-0a8727d0b067" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="2c27dbda-fbe3-48d6-80fe-b513e1c11702"
>>
>> If I run "partx -u /dev/sdb", then the filesystem gets activated and the OSD started.
>> And sometimes it just works without intervention, but that's the exception.
>>
>> I modified the udev script this morning, so I can give you the output of what happens when things go wrong: the links are created, but somewhere the UUID is wrongly detected by ceph-osd, as far as I understand:
>>
>> Thu Oct 9 11:15:13 CEST 2014
>> + PARTNO=2
>> + NAME=sde2
>> + PARENT_NAME=sde
>> ++ /usr/sbin/sgdisk --info=2 /dev/sde
>> ++ grep 'Partition GUID code'
>> ++ awk '{print $4}'
>> ++ tr '[:upper:]' '[:lower:]'
>> + ID_PART_ENTRY_TYPE=45b0969e-9b03-4f30-b4c6-b4b80ceff106
>> + '[' -z 45b0969e-9b03-4f30-b4c6-b4b80ceff106 ']'
>> ++ /usr/sbin/sgdisk --info=2 /dev/sde
>> ++ grep 'Partition unique GUID'
>> ++ awk '{print $4}'
>> ++ tr '[:upper:]' '[:lower:]'
>> + ID_PART_ENTRY_UUID=a9e8d490-82a7-48c1-8ef1-aff92351c69c
>> + mkdir -p /dev/disk/by-partuuid
>> + ln -sf ../../sde2 /dev/disk/by-partuuid/a9e8d490-82a7-48c1-8ef1-aff92351c69c
>> + mkdir -p /dev/disk/by-parttypeuuid
>> + ln -sf ../../sde2 /dev/disk/by-parttypeuuid/45b0969e-9b03-4f30-b4c6-b4b80ceff106.a9e8d490-82a7-48c1-8ef1-aff92351c69c
>> + case $ID_PART_ENTRY_TYPE in
>> + /usr/sbin/ceph-disk -v activate-journal /dev/sde2
>> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sde2
>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> DEBUG:ceph-disk:Journal /dev/sde2 has OSD UUID 00000000-0000-0000-0000-000000000000
>> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
>> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such file or directory
>> ceph-disk: Cannot discover filesystem type: device /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command '/sbin/blkid' returned non-zero exit status 2
>> + exit
>> + exec
>>
>> Regards,
>>
>> Frederic.
>>
>> P.S.: in your puppet module, it seems impossible to specify osd disks by path, i.e.:
>> ceph::profile::params::osds:
>> '/dev/disk/by-path/pci-0000\:0a\:00.0-scsi-0\:2\:':
>> (I tried without the backslashes too)
>>
>> -----Original message-----
>> From: Loic Dachary [mailto:loic@xxxxxxxxxxx]
>> Sent: Thursday, 9 October 2014 15:01
>> To: SCHAER Frederic; ceph-users@xxxxxxxxxxxxxx
>> Subject: Re: ceph-dis prepare : UUID=00000000-0000-0000-0000-000000000000
>>
>> Bonjour,
>>
>> I'm not familiar with RHEL7 but willing to learn ;-) I recently ran into confusing situations regarding the content of /dev/disk/by-partuuid because partprobe was not called when it should have been (ubuntu). On RHEL, kpartx is used instead because partprobe reboots, apparently. What is the content of /dev/disk/by-partuuid on your machine?
>>
>> ls -l /dev/disk/by-partuuid
>>
>> Cheers
>>
>> On 09/10/2014 12:24, SCHAER Frederic wrote:
>>> Hi,
>>>
>>> I am setting up a test ceph cluster on decommissioned hardware (hence: not optimal, I know).
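For readers following the trace above: the udev helper extracts both GUIDs from sgdisk output with a grep/awk/tr pipeline. The same parsing, wrapped in functions that read sgdisk output from stdin so they can be exercised against captured text (the function names are mine, for illustration):

```shell
# Sketch: the grep/awk/tr pipeline from the udev helper trace above,
# as stdin-reading functions. Function names are illustrative.

part_guid_code() {
    # e.g. 45b0969e-9b03-4f30-b4c6-b4b80ceff106 marks a ceph journal partition
    grep 'Partition GUID code' | awk '{print $4}' | tr '[:upper:]' '[:lower:]'
}

part_unique_guid() {
    # the per-partition GUID that /dev/disk/by-partuuid symlinks are named after
    grep 'Partition unique GUID' | awk '{print $4}' | tr '[:upper:]' '[:lower:]'
}

# On a live system: sgdisk --info=2 /dev/sde | part_unique_guid
```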
>>>
>>> I have installed CentOS 7, installed and set up ceph mons and OSD machines using puppet, and now I'm trying to add OSDs with the servers' OSD disks... and I have issues (of course ;) )
>>>
>>> I used the Ceph RHEL7 RPMs (ceph-0.80.6-0.el7.x86_64).
>>>
>>> When I run "ceph-disk prepare" for a disk, I most of the time (but not always) get the partitions created, but not activated:
>>>
>>> [root@ceph4 ~]# ceph-disk list|grep sdh
>>> WARNING:ceph-disk:Old blkid does not support ID_PART_ENTRY_* fields, trying sgdisk; may not correctly identify ceph volumes with dmcrypt
>>> /dev/sdh :
>>> /dev/sdh1 ceph data, prepared, cluster ceph, journal /dev/sdh2
>>> /dev/sdh2 ceph journal, for /dev/sdh1
>>>
>>> I tried to debug the udev rules, thinking they were not launched to activate the OSD, but they are, and they fail on this error:
>>>
>>> + ln -sf ../../sdh2 /dev/disk/by-partuuid/5b3bde8f-ccad-4093-a8a5-ad6413ae8931
>>> + mkdir -p /dev/disk/by-parttypeuuid
>>> + ln -sf ../../sdh2 /dev/disk/by-parttypeuuid/45b0969e-9b03-4f30-b4c6-b4b80ceff106.5b3bde8f-ccad-4093-a8a5-ad6413ae8931
>>> + case $ID_PART_ENTRY_TYPE in
>>> + /usr/sbin/ceph-disk -v activate-journal /dev/sdh2
>>> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdh2
>>> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> DEBUG:ceph-disk:Journal /dev/sdh2 has OSD UUID 00000000-0000-0000-0000-000000000000
>>> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
>>> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such file or directory
>>> ceph-disk: Cannot discover filesystem type: device /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command '/sbin/blkid' returned non-zero exit status 2
>>> + exit
>>> + exec
>>>
>>> You'll notice the zeroed UUID.
>>> Because of this, I looked at the output of ceph-disk prepare, and saw that partx complains at the end (this is the partx -a command):
>>>
>>> Warning: The kernel is still using the old partition table.
>>> The new table will be used at the next reboot.
>>> The operation has completed successfully.
>>> partx: /dev/sdh: error adding partitions 1-2
>>>
>>> And indeed, running "partx -a /dev/sdh" does not change anything.
>>> But I just discovered that running "partx -u /dev/sdh" fixes everything?!
>>> I.e.: right after I send this update command to the kernel, my debug logs show that the udev rule does everything fine and the OSD starts up.
>>>
>>> I'm therefore wondering what I did wrong.
>>> Is it CentOS 7 that is misbehaving, or the kernel, or... ?
>>> Any reason why partx -a is used instead of partx -u ?
>>>
>>> I'd be glad to hear others' advice on this!
>>>
>>> Thanks && regards
>>>
>>> Frederic Schaer
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
--
Loïc Dachary, Artisan Logiciel Libre
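For reference, the workaround Frederic describes above, forcing the kernel to update its view of the partition table with `partx -u` rather than `partx -a`, can be sketched like this (`reread_partitions` is a hypothetical wrapper name, not part of ceph-disk):

```shell
# Sketch of the workaround discussed in this thread: `partx -u` updates
# the kernel's existing partition entries in place, so it succeeds where
# `partx -a` reports "error adding partitions 1-2" because the kernel is
# still holding the old table. `reread_partitions` is an illustrative name.

reread_partitions() {
    dev="$1"
    partx -u "$dev" || return 1   # update entries, don't try to add them
    udevadm settle                # wait for the resulting udev events
}

# Example: reread_partitions /dev/sdh
```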