On 18/12/2015 22:09, Jesper Thorhauge wrote:
> Hi Loic,
>
> Getting closer!
>
> lrwxrwxrwx 1 root root 10 Dec 18 19:43 1e9d527f-0866-4284-b77c-c1cb04c5a168 -> ../../sdc4
> lrwxrwxrwx 1 root root 10 Dec 18 19:43 c34d4694-b486-450d-b57f-da24255f0072 -> ../../sdc3
> lrwxrwxrwx 1 root root 10 Dec 18 19:42 c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1
> lrwxrwxrwx 1 root root 10 Dec 18 19:42 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 -> ../../sda1
>
> So symlinks are now working! Activating an OSD is a different story :-(
>
> "ceph-disk -vv activate /dev/sda1" gives me;
>
> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/sda1
> INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
> INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
> DEBUG:ceph-disk:Mounting /dev/sda1 on /var/lib/ceph/tmp/mnt.A99cDp with options noatime,inode64
> INFO:ceph-disk:Running command: /bin/mount -t xfs -o noatime,inode64 -- /dev/sda1 /var/lib/ceph/tmp/mnt.A99cDp
> DEBUG:ceph-disk:Cluster uuid is 07b5c90b-6cae-40c0-93b2-31e0ebad7315
> INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
> DEBUG:ceph-disk:Cluster name is ceph
> DEBUG:ceph-disk:OSD uuid is e85f4d92-c8f1-4591-bd2a-aa43b80f58f6
> DEBUG:ceph-disk:OSD id is 6
> DEBUG:ceph-disk:Initializing OSD...
> INFO:ceph-disk:Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/tmp/mnt.A99cDp/activate.monmap
> got monmap epoch 6
> INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster ceph --mkfs --mkkey -i 6 --monmap /var/lib/ceph/tmp/mnt.A99cDp/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.A99cDp --osd-journal /var/lib/ceph/tmp/mnt.A99cDp/journal --osd-uuid e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 --keyring /var/lib/ceph/tmp/mnt.A99cDp/keyring
> HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
> 2015-12-18 21:58:12.489357 7f266d7b0800 -1 journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected e85f4d92-c8f1-4591-bd2a-aa43b80f58f6, invalid (someone else's?) journal
> HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
> HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
> HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
> 2015-12-18 21:58:12.680566 7f266d7b0800 -1 filestore(/var/lib/ceph/tmp/mnt.A99cDp) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
> 2015-12-18 21:58:12.865810 7f266d7b0800 -1 created object store /var/lib/ceph/tmp/mnt.A99cDp journal /var/lib/ceph/tmp/mnt.A99cDp/journal for osd.6 fsid 07b5c90b-6cae-40c0-93b2-31e0ebad7315
> 2015-12-18 21:58:12.865844 7f266d7b0800 -1 auth: error reading file: /var/lib/ceph/tmp/mnt.A99cDp/keyring: can't open /var/lib/ceph/tmp/mnt.A99cDp/keyring: (2) No such file or directory
> 2015-12-18 21:58:12.865910 7f266d7b0800 -1 created new key in keyring /var/lib/ceph/tmp/mnt.A99cDp/keyring
> INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup init
> DEBUG:ceph-disk:Marking with init system sysvinit
> DEBUG:ceph-disk:Authorizing OSD key...
> INFO:ceph-disk:Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring auth add osd.6 -i /var/lib/ceph/tmp/mnt.A99cDp/keyring osd allow * mon allow profile osd
> Error EINVAL: entity osd.6 exists but key does not match
> ERROR:ceph-disk:Failed to activate
> DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.A99cDp
> INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.A99cDp
> Traceback (most recent call last):
>   File "/usr/sbin/ceph-disk", line 2994, in <module>
>     main()
>   File "/usr/sbin/ceph-disk", line 2972, in main
>     args.func(args)
>   File "/usr/sbin/ceph-disk", line 2178, in main_activate
>     init=args.mark_init,
>   File "/usr/sbin/ceph-disk", line 1954, in mount_activate
>     (osd_id, cluster) = activate(path, activate_key_template, init)
>   File "/usr/sbin/ceph-disk", line 2153, in activate
>     keyring=keyring,
>   File "/usr/sbin/ceph-disk", line 1756, in auth_key
>     'mon', 'allow profile osd',
>   File "/usr/sbin/ceph-disk", line 323, in command_check_call
>     return subprocess.check_call(arguments)
>   File "/usr/lib64/python2.6/subprocess.py", line 505, in check_call
>     raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '['/usr/bin/ceph', '--cluster', 'ceph', '--name', 'client.bootstrap-osd', '--keyring', '/var/lib/ceph/bootstrap-osd/ceph.keyring', 'auth', 'add', 'osd.6', '-i', '/var/lib/ceph/tmp/mnt.A99cDp/keyring', 'osd', 'allow *', 'mon', 'allow profile osd']' returned non-zero exit status 22

This is a different problem: osd.6 seems to have been resurrected with wrong permissions. I suggest you zap the disks and start over.

Cheers

> Thanks!
>
> /Jesper
>
> ***************
>
> Hi Jesper,
>
> The goal of the rc.local is twofold, but mainly to ensure the /dev/disk/by-partuuid symlinks exist for the journals. Is that the case?
>
> Cheers
>
> On 18/12/2015 19:50, Jesper Thorhauge wrote:
>> Hi Loic,
>>
>> Damn, the updated udev didn't fix the problem :-(
>>
>> The rc.local workaround is also complaining;
>>
>> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdc3
>> libust[2648/2648]: Warning: HOME environment variable not set. Disabling LTTng-UST per-user tracing. (in setup_local_apps() at lttng-ust-comm.c:305)
>> HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
>> DEBUG:ceph-disk:Journal /dev/sdc3 has OSD UUID 00000000-0000-0000-0000-000000000000
>> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
>> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such file or directory
>> ceph-disk: Cannot discover filesystem type: device /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command '/sbin/blkid' returned non-zero exit status 2
>> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdc4
>> libust[2687/2687]: Warning: HOME environment variable not set. Disabling LTTng-UST per-user tracing. (in setup_local_apps() at lttng-ust-comm.c:305)
>> HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
>> DEBUG:ceph-disk:Journal /dev/sdc4 has OSD UUID 00000000-0000-0000-0000-000000000000
>> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
>> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such file or directory
>> ceph-disk: Cannot discover filesystem type: device /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command '/sbin/blkid' returned non-zero exit status 2
>>
>> /dev/sdc1 and /dev/sdc2 contain the boot loader and OS, so driverwise i guess things are working :-)
>>
>> But "HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device" seems to be the underlying issue.
>>
>> Any thoughts?
>>
>> /Jesper
>>
>> *****************
>>
>> Hi Loic,
>>
>> searched around for possible udev bugs, and then tried to run "yum update". Udev did have a fresh update with the following version diffs;
>>
>> udev-147-2.63.el6_7.1.x86_64 --> udev-147-2.63.el6_7.1.x86_64
>>
>> from what i can see this update fixes stuff related to symbolic links / external devices. /dev/sdc sits on external eSata. So...
>>
>> https://rhn.redhat.com/errata/RHBA-2015-1382.html
>>
>> will reboot tonight and get back :-)
>>
>> /jesper
>>
>> ***********************
>>
>> I guess that's the problem you need to solve: why /dev/sdc does not generate udev events (different driver than /dev/sda maybe?). Once it does, Ceph should work.
>>
>> A workaround could be to add something like:
>>
>> ceph-disk-udev 3 sdc3 sdc
>> ceph-disk-udev 4 sdc4 sdc
>>
>> in /etc/rc.local.
>>
>> On 17/12/2015 12:01, Jesper Thorhauge wrote:
>>> Nope, the previous post contained all that was in the boot.log :-(
>>>
>>> /Jesper
>>>
>>> **********
>>>
>>> ----- On 17 Dec 2015, at 11:53, Loic Dachary <loic@xxxxxxxxxxx> wrote:
>>>
>>> On 17/12/2015 11:33, Jesper Thorhauge wrote:
>>>> Hi Loic,
>>>>
>>>> Sounds like something does go wrong when /dev/sdc3 shows up. Is there any way i can debug this further? Log-files? Modify the .rules file...?
>>>
>>> Do you see traces of what happens when /dev/sdc3 shows up in boot.log?
>>>
>>>>
>>>> /Jesper
>>>>
>>>> ****************
>>>>
>>>> The non-symlink files in /dev/disk/by-partuuid come into existence because of:
>>>>
>>>> * system boots
>>>> * udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1
>>>> * ceph-disk-udev creates the symlink /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1
>>>> * ceph-disk activate mounts /dev/sda1 and finds a symlink to the journal (journal -> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168) which does not yet exist because the /dev/sdc udev rules have not been run yet
>>>> * ceph-osd opens the journal in write mode and that creates /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 as a regular file
>>>> * the file is empty and the osd fails to activate with the error you see (EINVAL because the file is empty)
>>>>
>>>> This is ok, supported and expected since there is no way to know which disk will show up first.
>>>>
>>>> When /dev/sdc shows up, the same logic will be triggered:
>>>>
>>>> * udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1
>>>> * ceph-disk-udev creates the symlink /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 -> ../../sdc3 (overwriting the regular file, because of ln -sf)
>>>> * ceph-disk activate-journal /dev/sdc3 finds that c83b5aa5-fe77-42f6-9415-25ca0266fb7f is the data partition for that journal and mounts /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f
>>>> * ceph-osd opens the journal and all is well
>>>>
>>>> Except something goes wrong in your case, presumably because ceph-disk-udev is not called when /dev/sdc3 shows up?
>>>>
>>>> On 17/12/2015 08:29, Jesper Thorhauge wrote:
>>>>> Hi Loic,
>>>>>
>>>>> OSDs are on /dev/sda and /dev/sdb, journals are on /dev/sdc (sdc3 / sdc4).
>>>>>
>>>>> sgdisk for sda shows;
>>>>>
>>>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>>>> Partition unique GUID: E85F4D92-C8F1-4591-BD2A-AA43B80F58F6
>>>>> First sector: 2048 (at 1024.0 KiB)
>>>>> Last sector: 1953525134 (at 931.5 GiB)
>>>>> Partition size: 1953523087 sectors (931.5 GiB)
>>>>> Attribute flags: 0000000000000000
>>>>> Partition name: 'ceph data'
>>>>>
>>>>> for sdb
>>>>>
>>>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>>>> Partition unique GUID: C83B5AA5-FE77-42F6-9415-25CA0266FB7F
>>>>> First sector: 2048 (at 1024.0 KiB)
>>>>> Last sector: 1953525134 (at 931.5 GiB)
>>>>> Partition size: 1953523087 sectors (931.5 GiB)
>>>>> Attribute flags: 0000000000000000
>>>>> Partition name: 'ceph data'
>>>>>
>>>>> for /dev/sdc3
>>>>>
>>>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>>>> Partition unique GUID: C34D4694-B486-450D-B57F-DA24255F0072
>>>>> First sector: 935813120 (at 446.2 GiB)
>>>>> Last sector: 956293119 (at 456.0 GiB)
>>>>> Partition size: 20480000 sectors (9.8 GiB)
>>>>> Attribute flags: 0000000000000000
>>>>> Partition name: 'ceph journal'
>>>>>
>>>>> for /dev/sdc4
>>>>>
>>>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>>>> Partition unique GUID: 1E9D527F-0866-4284-B77C-C1CB04C5A168
>>>>> First sector: 956293120 (at 456.0 GiB)
>>>>> Last sector: 976773119 (at 465.8 GiB)
>>>>> Partition size: 20480000 sectors (9.8 GiB)
>>>>> Attribute flags: 0000000000000000
>>>>> Partition name: 'ceph journal'
>>>>>
>>>>> 60-ceph-partuuid-workaround.rules is located in /lib/udev/rules.d, so it seems correct to me.
>>>>>
>>>>> after a reboot, /dev/disk/by-partuuid is;
>>>>>
>>>>> -rw-r--r-- 1 root root 0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168
>>>>> -rw-r--r-- 1 root root 0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072
>>>>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1
>>>>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 -> ../../sda1
>>>>>
>>>>> i don't know how to verify the symlink of the journal file - can you guide me on that one?
>>>>>
>>>>> Thanks :-) !
>>>>>
>>>>> /Jesper
>>>>>
>>>>> **************
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 17/12/2015 07:53, Jesper Thorhauge wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Some more information showing in the boot.log;
>>>>>>
>>>>>> 2015-12-16 07:35:33.289830 7f1b990ad800 -1 filestore(/var/lib/ceph/tmp/mnt.aWZTcE) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.aWZTcE/journal: (22) Invalid argument
>>>>>> 2015-12-16 07:35:33.289842 7f1b990ad800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -22
>>>>>> 2015-12-16 07:35:33.289883 7f1b990ad800 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.aWZTcE: (22) Invalid argument
>>>>>> ERROR:ceph-disk:Failed to activate
>>>>>> ceph-disk: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '--mkkey', '-i', '7', '--monmap', '/var/lib/ceph/tmp/mnt.aWZTcE/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.aWZTcE', '--osd-journal', '/var/lib/ceph/tmp/mnt.aWZTcE/journal', '--osd-uuid', 'c83b5aa5-fe77-42f6-9415-25ca0266fb7f', '--keyring', '/var/lib/ceph/tmp/mnt.aWZTcE/keyring']' returned non-zero exit status 1
>>>>>> ceph-disk: Error: One or more partitions failed to activate
>>>>>>
>>>>>> Maybe related to the "(22) Invalid argument" part..?
>>>>>
>>>>> After a reboot the symlinks are reconstructed and if they are still incorrect, it means there is an inconsistency somewhere else. To debug the problem, could you mount /dev/sda1 and verify the symlink of the journal file?
>>>>> Then verify the content of /dev/disk/by-partuuid. And also display the partition information with sgdisk -i 1 /dev/sda and sgdisk -i 2 /dev/sda. Are you collocating your journal with the data, on the same disk? Or are they on two different disks?
>>>>>
>>>>> git log --no-merges --oneline tags/v0.94.3..tags/v0.94.5 udev
>>>>>
>>>>> shows nothing, meaning there has been no change to udev rules. There is one change related to the installation of the udev rules (https://github.com/ceph/ceph/commit/4eb58ad2027148561d94bb43346b464b55d041a6). Could you double check 60-ceph-partuuid-workaround.rules is installed where it should be?
>>>>>
>>>>> Cheers
>>>>>
>>>>>>
>>>>>> /Jesper
>>>>>>
>>>>>> *********************
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have done several reboots, and it did not lead to healthy symlinks :-(
>>>>>>
>>>>>> /Jesper
>>>>>>
>>>>>> ************
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 16/12/2015 07:39, Jesper Thorhauge wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> A fresh server install on one of my nodes (and yum update) left me with CentOS 6.7 / Ceph 0.94.5. All the other nodes are running Ceph 0.94.2.
>>>>>>>
>>>>>>> "ceph-disk prepare /dev/sda /dev/sdc" seems to work as expected, but "ceph-disk activate /dev/sda1" fails. I have traced the problem to "/dev/disk/by-partuuid", where the journal symlinks are broken;
>>>>>>>
>>>>>>> -rw-r--r-- 1 root root 0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168
>>>>>>> -rw-r--r-- 1 root root 0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072
>>>>>>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1
>>>>>>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 -> ../../sda1
>>>>>>>
>>>>>>> Re-creating them manually won't survive a reboot. Is this a problem with the udev rules in Ceph 0.94.3+?
>>>>>>
>>>>>> This usually is a symptom of something else going wrong (i.e. it is possible to confuse the kernel into creating the wrong symbolic links).
>>>>>> The correct symlinks should be set when you reboot.
>>>>>>
>>>>>>> Hope that somebody can help me :-)
>>>>>>
>>>>>> Please let us know if rebooting leads to healthy symlinks.
>>>>>>
>>>>>> Cheers
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Jesper
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> ceph-users mailing list
>>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Loïc Dachary, Artisan Logiciel Libre

--
Loïc Dachary, Artisan Logiciel Libre
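
For anyone hitting the same symptom, here is a minimal Python sketch of the check discussed in this thread. It lists a directory such as /dev/disk/by-partuuid and flags entries that are regular files rather than symlinks, which is the broken state shown in the listings from Dec 16. The function names `classify_partuuid_entries` and `broken_entries` are hypothetical helpers made up for illustration; they are not part of ceph-disk or udev.

```python
import os
import stat


def classify_partuuid_entries(directory="/dev/disk/by-partuuid"):
    """Return {entry name: description} for every entry in `directory`.

    A healthy entry is a symlink such as "-> ../../sdc3". A regular file
    is the broken case from the thread (an empty file left behind when
    the journal path was opened before udev had created the symlink).
    """
    report = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        mode = os.lstat(path).st_mode  # lstat: inspect the link itself
        if stat.S_ISLNK(mode):
            report[name] = "symlink -> " + os.readlink(path)
        elif stat.S_ISREG(mode):
            report[name] = "regular file (broken)"
        else:
            report[name] = "other"
    return report


def broken_entries(report):
    """Names whose entry is a regular file instead of a symlink."""
    return [name for name, kind in sorted(report.items())
            if kind.startswith("regular file")]
```

Calling `broken_entries(classify_partuuid_entries())` on the affected node would list the UUIDs that need their symlinks recreated. An empty result only means every entry is a symlink; it does not prove that the journal behind the symlink is valid.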