Hi Loic,

With this example disk/machine that I left untouched until now:

/dev/sdb :
 /dev/sdb1 ceph data, prepared, cluster ceph, osd.44, journal /dev/sdb2
 /dev/sdb2 ceph journal, for /dev/sdb1

[root@ceph1 ~]# ll /dev/disk/by-partuuid/
total 0
lrwxrwxrwx 1 root root 10 Oct 9 15:09 2c27dbda-fbe3-48d6-80fe-b513e1c11702 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Oct 9 15:09 d2352e3b-f7f2-40c7-8273-8bfa8ab4206a -> ../../sdb2

This is the blkid output:

[root@ceph1 ~]# blkid /dev/sdb2
[root@ceph1 ~]# blkid /dev/sdb1
/dev/sdb1: UUID="c8feaaad-bd83-41a3-a82a-0a8727d0b067" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="2c27dbda-fbe3-48d6-80fe-b513e1c11702"

If I run "partx -u /dev/sdb", the filesystem gets activated and the OSD starts.
Sometimes it just works without intervention, but that's the exception.

I modified the udev script this morning, so I can give you the output of what happens when things go wrong: the links are created, but somewhere the UUID is wrongly detected by ceph-osd, as far as I understand:

Thu Oct 9 11:15:13 CEST 2014
+ PARTNO=2
+ NAME=sde2
+ PARENT_NAME=sde
++ /usr/sbin/sgdisk --info=2 /dev/sde
++ grep 'Partition GUID code'
++ awk '{print $4}'
++ tr '[:upper:]' '[:lower:]'
+ ID_PART_ENTRY_TYPE=45b0969e-9b03-4f30-b4c6-b4b80ceff106
+ '[' -z 45b0969e-9b03-4f30-b4c6-b4b80ceff106 ']'
++ /usr/sbin/sgdisk --info=2 /dev/sde
++ grep 'Partition unique GUID'
++ awk '{print $4}'
++ tr '[:upper:]' '[:lower:]'
+ ID_PART_ENTRY_UUID=a9e8d490-82a7-48c1-8ef1-aff92351c69c
+ mkdir -p /dev/disk/by-partuuid
+ ln -sf ../../sde2 /dev/disk/by-partuuid/a9e8d490-82a7-48c1-8ef1-aff92351c69c
+ mkdir -p /dev/disk/by-parttypeuuid
+ ln -sf ../../sde2 /dev/disk/by-parttypeuuid/45b0969e-9b03-4f30-b4c6-b4b80ceff106.a9e8d490-82a7-48c1-8ef1-aff92351c69c
+ case $ID_PART_ENTRY_TYPE in
+ /usr/sbin/ceph-disk -v activate-journal /dev/sde2
INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sde2
SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DEBUG:ceph-disk:Journal /dev/sde2 has OSD UUID 00000000-0000-0000-0000-000000000000
INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such file or directory
ceph-disk: Cannot discover filesystem type: device /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command '/sbin/blkid' returned non-zero exit status 2
+ exit
+ exec

Regards,
Frederic.

P.S.: in your puppet module, it seems impossible to specify OSD disks by path, for example:

ceph::profile::params::osds:
  '/dev/disk/by-path/pci-0000\:0a\:00.0-scsi-0\:2\:':

(I tried without the backslashes too.)

-----Original Message-----
From: Loic Dachary [mailto:loic@xxxxxxxxxxx]
Sent: Thursday, 9 October 2014 15:01
To: SCHAER Frederic; ceph-users@xxxxxxxxxxxxxx
Subject: Re: ceph-dis prepare : UUID=00000000-0000-0000-0000-000000000000

Hello,

I'm not familiar with RHEL7 but willing to learn ;-) I recently ran into confusing situations regarding the content of /dev/disk/by-partuuid because partprobe was not called when it should have been (Ubuntu). On RHEL, kpartx is used instead because partprobe reboots, apparently. What is the content of /dev/disk/by-partuuid on your machine?

ls -l /dev/disk/by-partuuid

Cheers

On 09/10/2014 12:24, SCHAER Frederic wrote:
> Hi,
>
> I am setting up a test ceph cluster on decommissioned hardware (hence: not optimal, I know).
>
> I have installed CentOS7, installed and set up the ceph mons and OSD machines using puppet, and now I'm trying to add OSDs with the servers' OSD disks.
> And I have issues (of course ;) )
>
> I used the Ceph RHEL7 RPMs (ceph-0.80.6-0.el7.x86_64)
>
> When I run "ceph-disk prepare" for a disk, most of the time (but not always) the partitions get created but not activated:
>
> [root@ceph4 ~]# ceph-disk list|grep sdh
> WARNING:ceph-disk:Old blkid does not support ID_PART_ENTRY_* fields, trying sgdisk; may not correctly identify ceph volumes with dmcrypt
> /dev/sdh :
>  /dev/sdh1 ceph data, prepared, cluster ceph, journal /dev/sdh2
>  /dev/sdh2 ceph journal, for /dev/sdh1
>
> I tried to debug the udev rules, thinking they were not launched to activate the OSD, but they are, and they fail on this error:
>
> + ln -sf ../../sdh2 /dev/disk/by-partuuid/5b3bde8f-ccad-4093-a8a5-ad6413ae8931
> + mkdir -p /dev/disk/by-parttypeuuid
> + ln -sf ../../sdh2 /dev/disk/by-parttypeuuid/45b0969e-9b03-4f30-b4c6-b4b80ceff106.5b3bde8f-ccad-4093-a8a5-ad6413ae8931
> + case $ID_PART_ENTRY_TYPE in
> + /usr/sbin/ceph-disk -v activate-journal /dev/sdh2
> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdh2
> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> DEBUG:ceph-disk:Journal /dev/sdh2 has OSD UUID 00000000-0000-0000-0000-000000000000
> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such file or directory
> ceph-disk: Cannot discover filesystem type: device /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command '/sbin/blkid' returned non-zero exit status 2
> + exit
> + exec
>
> You'll notice the zeroed UUID.
>
> Because of this, I looked at the output of ceph-disk prepare and saw that partx complains at the end (this is the partx -a command):
>
> Warning: The kernel is still using the old partition table.
> The new table will be used at the next reboot.
> The operation has completed successfully.
> partx: /dev/sdh: error adding partitions 1-2
>
> And indeed, running "partx -a /dev/sdh" does not change anything.
> But I just discovered that running "partx -u /dev/sdh" fixes everything...?
> I.e.: right after I send this update command to the kernel, my debug logs show that the udev rule does everything fine and the OSD starts up.
>
> I'm therefore wondering what I did wrong.
> Is it CentOS 7 that is misbehaving, or the kernel, or...?
> Any reason why partx -a is used instead of partx -u?
>
> I'd be glad to hear others' advice on this!
>
> Thanks && regards
>
> Frederic Schaer
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Loïc Dachary, Artisan Logiciel Libre
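
For anyone reproducing this: the udev trace in this thread derives the partition type GUID and unique GUID by piping "sgdisk --info" through grep, awk and tr, and the failing runs end with ceph-disk reporting an all-zero OSD UUID. Below is a minimal sketch of those two checks as shell functions. The function names extract_guid and is_zero_uuid are hypothetical (they are not part of ceph-disk or the udev script), and the sketch assumes the sgdisk output has the GUID as the fourth whitespace-separated field, as the awk call in the trace implies.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, not part of ceph-disk.

# Read "sgdisk --info=N DEVICE" output on stdin and print, in lower case,
# the fourth field of every line matching the label passed as $1.
# This mirrors the grep | awk | tr pipeline visible in the udev trace.
extract_guid() {
    grep "$1" | awk '{print $4}' | tr '[:upper:]' '[:lower:]'
}

# Succeed (exit status 0) when $1 is the all-zero UUID seen in the
# failing activate-journal runs, fail otherwise.
is_zero_uuid() {
    [ "$1" = "00000000-0000-0000-0000-000000000000" ]
}
```

A possible use would be "/usr/sbin/sgdisk --info=2 /dev/sde | extract_guid 'Partition unique GUID'", and is_zero_uuid could guard a retry after "partx -u" on the parent device, which is the workaround described above. Whether such a retry belongs in the udev script is an open question, not something the thread settles.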