Re: Journal symlink broken / Ceph 0.94.5 / CentOS 6.7

Hi Jesper,

The goal of the rc.local workaround is twofold, but mainly to ensure that the /dev/disk/by-partuuid symlinks exist for the journals. Is that the case?
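
One way to check this, as a minimal sketch rather than anything shipped with Ceph: the function name classify_entries and its labels are made up for illustration, and it only assumes that healthy entries in /dev/disk/by-partuuid are symlinks pointing at block devices.

```python
import os
import stat


def classify_entries(directory="/dev/disk/by-partuuid"):
    """Return a dict mapping each entry name to a short description of what it is."""
    result = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.islink(path):
            # A regular file here means something created the path before udev did.
            result[name] = "not a symlink"
            continue
        try:
            mode = os.stat(path).st_mode  # os.stat follows the symlink
        except OSError:
            result[name] = "dangling symlink"
            continue
        if stat.S_ISBLK(mode):
            result[name] = "symlink to block device"
        else:
            result[name] = "symlink to non-block target"
    return result
```

Calling classify_entries() with no argument and printing the result after a reboot should show whether the journal entries are still regular files.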

Cheers

On 18/12/2015 19:50, Jesper Thorhauge wrote:
> Hi Loic,
> 
> Damn, the updated udev didn't fix the problem :-(
> 
> The rc.local workaround is also complaining;
> 
> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdc3
> libust[2648/2648]: Warning: HOME environment variable not set. Disabling LTTng-UST per-user tracing. (in setup_local_apps() at lttng-ust-comm.c:305)
>  HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
> DEBUG:ceph-disk:Journal /dev/sdc3 has OSD UUID 00000000-0000-0000-0000-000000000000
> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such file or directory
> ceph-disk: Cannot discover filesystem type: device /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command '/sbin/blkid' returned non-zero exit status 2
> INFO:ceph-disk:Running command: /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdc4
> libust[2687/2687]: Warning: HOME environment variable not set. Disabling LTTng-UST per-user tracing. (in setup_local_apps() at lttng-ust-comm.c:305)
>  HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
> DEBUG:ceph-disk:Journal /dev/sdc4 has OSD UUID 00000000-0000-0000-0000-000000000000
> INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000
> error: /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: No such file or directory
> ceph-disk: Cannot discover filesystem type: device /dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000: Command '/sbin/blkid' returned non-zero exit status 2
> 
> /dev/sdc1 and /dev/sdc2 contain the boot loader and OS, so driver-wise I guess things are working :-)
> 
> But "HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device" seems to be the underlying issue.
> 
> Any thoughts?
> 
> /Jesper
> 
> *****************
> 
> Hi Loic,
> 
> I searched around for possible udev bugs, and then tried to run "yum update". Udev did have a fresh update with the following version diffs;
> 
> udev-147-2.63.el6_7.1.x86_64 --> udev-147-2.63.el6_7.1.x86_64
> 
> From what I can see this update fixes stuff related to symbolic links / external devices. /dev/sdc sits on external eSATA. So...
> 
> https://rhn.redhat.com/errata/RHBA-2015-1382.html
> 
> Will reboot tonight and get back :-)
> 
> /jesper
> 
> ***********************
> 
> I guess that's the problem you need to solve : why /dev/sdc does not generate udev events (different driver than /dev/sda maybe ?). Once it does, Ceph should work.
> 
> A workaround could be to add something like:
> 
> ceph-disk-udev 3 sdc3 sdc
> ceph-disk-udev 4 sdc4 sdc
> 
> in /etc/rc.local.
> 
> On 17/12/2015 12:01, Jesper Thorhauge wrote:
>> Nope, the previous post contained all that was in the boot.log :-(
>>
>> /Jesper
>>
>> **********
>>
>> ----- On 17 Dec 2015, at 11:53, Loic Dachary <loic@xxxxxxxxxxx> wrote:
>>
>> On 17/12/2015 11:33, Jesper Thorhauge wrote:
>>> Hi Loic,
>>>
>>> Sounds like something does go wrong when /dev/sdc3 shows up. Is there any way I can debug this further? Log files? Modify the .rules file...?
>>
>> Do you see traces of what happens when /dev/sdc3 shows up in boot.log ?
>>
>>>
>>> /Jesper
>>>
>>> ****************
>>>
>>> The non-symlink files in /dev/disk/by-partuuid come into existence as follows:
>>>
>>> * system boots
>>> * udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sdb1
>>>   * ceph-disk-udev creates the symlink /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1
>>>   * ceph-disk activate mounts /dev/sdb1 and finds the journal symlink, journal -> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168, which does not exist yet because the udev rules for /dev/sdc have not run yet
>>>   * ceph-osd opens the journal in write mode, and that creates /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 as a regular file
>>>   * the file is empty and the osd fails to activate with the error you see (EINVAL because the file is empty)
>>>
>>> This is ok, supported and expected since there is no way to know which disk will show up first.
>>>
>>> When /dev/sdc shows up, the same logic will be triggered:
>>>
>>> * udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sdc4
>>>   * ceph-disk-udev creates the symlink /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 -> ../../sdc4 (overriding the file because of ln -sf)
>>>   * ceph-disk activate-journal /dev/sdc4 finds that c83b5aa5-fe77-42f6-9415-25ca0266fb7f is the data partition for that journal and mounts /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f
>>>   * ceph-osd opens the journal and all is well
>>>
>>> Except something goes wrong in your case, presumably because ceph-disk-udev is not called when /dev/sdc3 and /dev/sdc4 show up?
>>>
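
The boot-order sequence described above can be replayed in a scratch directory. This is only an illustrative sketch: simulate_boot_order is a made-up name, a plain file stands in for the journal partition, opening the path in append mode stands in for ceph-osd opening the journal in write mode, and remove-then-symlink stands in for ln -sf.

```python
import os
import tempfile


def simulate_boot_order(workdir):
    """Replay both steps in workdir; return (type_before, size_before, type_after)."""
    # Stand-in for the journal partition (a plain file, not a real block device).
    device = os.path.join(workdir, "sdc4")
    open(device, "w").close()
    link = os.path.join(workdir, "1e9d527f-0866-4284-b77c-c1cb04c5a168")

    # Step 1: the data disk shows up first. Opening the missing journal path
    # for writing creates an empty regular file where a symlink should be.
    open(link, "a").close()
    type_before = "symlink" if os.path.islink(link) else "regular file"
    size_before = os.path.getsize(link)

    # Step 2: the journal disk shows up. Emulate `ln -sf` by removing the
    # existing entry and creating the symlink in its place.
    os.remove(link)
    os.symlink(device, link)
    type_after = "symlink" if os.path.islink(link) else "regular file"
    return type_before, size_before, type_after
```

If the second step never runs, which is what seems to happen here, the empty regular file from the first step is what remains in /dev/disk/by-partuuid.
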
>>> On 17/12/2015 08:29, Jesper Thorhauge wrote:
>>>> Hi Loic,
>>>>
>>>> The OSDs are on /dev/sda and /dev/sdb; the journals are on /dev/sdc (sdc3 / sdc4).
>>>>
>>>> sgdisk for sda shows;
>>>>
>>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>>> Partition unique GUID: E85F4D92-C8F1-4591-BD2A-AA43B80F58F6
>>>> First sector: 2048 (at 1024.0 KiB)
>>>> Last sector: 1953525134 (at 931.5 GiB)
>>>> Partition size: 1953523087 sectors (931.5 GiB)
>>>> Attribute flags: 0000000000000000
>>>> Partition name: 'ceph data'
>>>>
>>>> for sdb
>>>>
>>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>>> Partition unique GUID: C83B5AA5-FE77-42F6-9415-25CA0266FB7F
>>>> First sector: 2048 (at 1024.0 KiB)
>>>> Last sector: 1953525134 (at 931.5 GiB)
>>>> Partition size: 1953523087 sectors (931.5 GiB)
>>>> Attribute flags: 0000000000000000
>>>> Partition name: 'ceph data'
>>>>
>>>> for /dev/sdc3
>>>>
>>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>>> Partition unique GUID: C34D4694-B486-450D-B57F-DA24255F0072
>>>> First sector: 935813120 (at 446.2 GiB)
>>>> Last sector: 956293119 (at 456.0 GiB)
>>>> Partition size: 20480000 sectors (9.8 GiB)
>>>> Attribute flags: 0000000000000000
>>>> Partition name: 'ceph journal'
>>>>
>>>> for /dev/sdc4
>>>>
>>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>>> Partition unique GUID: 1E9D527F-0866-4284-B77C-C1CB04C5A168
>>>> First sector: 956293120 (at 456.0 GiB)
>>>> Last sector: 976773119 (at 465.8 GiB)
>>>> Partition size: 20480000 sectors (9.8 GiB)
>>>> Attribute flags: 0000000000000000
>>>> Partition name: 'ceph journal'
>>>>
>>>> 60-ceph-partuuid-workaround.rules is located in /lib/udev/rules.d, so it seems correct to me.
>>>>
>>>> after a reboot, /dev/disk/by-partuuid is;
>>>>
>>>> -rw-r--r-- 1 root root  0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168
>>>> -rw-r--r-- 1 root root  0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072
>>>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1
>>>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 -> ../../sda1
>>>>
>>>> I don't know how to verify the symlink of the journal file. Can you guide me on that one?
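
A possible way to do this, sketched under assumptions: the data partition is mounted somewhere (for example mount /dev/sda1 /mnt, where /mnt is just an example mount point), and describe_journal is a hypothetical helper, not part of ceph-disk. It reads the journal entry in the mounted directory and reports where it points.

```python
import os


def describe_journal(osd_data_dir):
    """Return (link_target, resolved_path, target_exists) for <osd_data_dir>/journal.

    link_target is None when the journal entry is not a symlink.
    """
    journal = os.path.join(osd_data_dir, "journal")
    if not os.path.islink(journal):
        return None, journal, os.path.exists(journal)
    target = os.readlink(journal)          # the path stored in the symlink
    resolved = os.path.realpath(journal)   # follows further symlinks where possible
    return target, resolved, os.path.exists(resolved)
```

The link target should be a /dev/disk/by-partuuid/ path whose last component matches the "Partition unique GUID" that sgdisk reports for the journal partition, and the resolved path should be that partition.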
>>>>
>>>> Thanks :-) !
>>>>
>>>> /Jesper
>>>>
>>>> **************
>>>>
>>>> Hi,
>>>>
>>>> On 17/12/2015 07:53, Jesper Thorhauge wrote:
>>>>> Hi,
>>>>>
>>>>> Some more information showing in the boot.log;
>>>>>
>>>>> 2015-12-16 07:35:33.289830 7f1b990ad800 -1 filestore(/var/lib/ceph/tmp/mnt.aWZTcE) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.aWZTcE/journal: (22) Invalid argument
>>>>> 2015-12-16 07:35:33.289842 7f1b990ad800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -22
>>>>> 2015-12-16 07:35:33.289883 7f1b990ad800 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.aWZTcE: (22) Invalid argument
>>>>> ERROR:ceph-disk:Failed to activate
>>>>> ceph-disk: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '--mkkey', '-i', '7', '--monmap', '/var/lib/ceph/tmp/mnt.aWZTcE/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.aWZTcE', '--osd-journal', '/var/lib/ceph/tmp/mnt.aWZTcE/journal', '--osd-uuid', 'c83b5aa5-fe77-42f6-9415-25ca0266fb7f', '--keyring', '/var/lib/ceph/tmp/mnt.aWZTcE/keyring']' returned non-zero exit status 1
>>>>> ceph-disk: Error: One or more partitions failed to activate
>>>>>
>>>>> Maybe related to the "(22) Invalid argument" part..?
>>>>
>>>> After a reboot the symlinks are reconstructed and if they are still incorrect, it means there is an inconsistency somewhere else. To debug the problem, could you mount /dev/sda1 and verify the symlink of the journal file ? Then verify the content of /dev/disk/by-partuuid. And also display the partition information with sgdisk -i 1 /dev/sda and sgdisk -i 2 /dev/sda. Are you collocating your journal with the data, on the same disk ? Or are they on two different disks ?
>>>>
>>>> git log --no-merges --oneline tags/v0.94.3..tags/v0.94.5 udev
>>>>
>>>> shows nothing, meaning there has been no change to the udev rules. There is one change related to the installation of the udev rules: https://github.com/ceph/ceph/commit/4eb58ad2027148561d94bb43346b464b55d041a6. Could you double-check that 60-ceph-partuuid-workaround.rules is installed where it should be?
>>>>
>>>> Cheers
>>>>
>>>>>
>>>>> /Jesper
>>>>>
>>>>> *********************
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have done several reboots, and it did not lead to healthy symlinks :-(
>>>>>
>>>>> /Jesper
>>>>>
>>>>> ************
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 16/12/2015 07:39, Jesper Thorhauge wrote:
>>>>>> Hi,
>>>>>>
>>>>>> A fresh server install on one of my nodes (and yum update) left me with CentOS 6.7 / Ceph 0.94.5. All the other nodes are running Ceph 0.94.2.
>>>>>>
>>>>>> "ceph-disk prepare /dev/sda /dev/sdc" seems to work as expected, but "ceph-disk activate /dev/sda1" fails. I have traced the problem to "/dev/disk/by-partuuid", where the journal symlinks are broken;
>>>>>>
>>>>>> -rw-r--r-- 1 root root  0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168
>>>>>> -rw-r--r-- 1 root root  0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072
>>>>>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1
>>>>>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 -> ../../sda1
>>>>>>
>>>>>> Re-creating them manually won't survive a reboot. Is this a problem with the udev rules in Ceph 0.94.3+?
>>>>>
>>>>> This usually is a symptom of something else going wrong (i.e. it is possible to confuse the kernel into creating the wrong symbolic links). The correct symlinks should be set when you reboot.
>>>>>
>>>>>> Hope that somebody can help me :-)
>>>>>
>>>>> Please let us know if rebooting leads to healthy symlinks.
>>>>>
>>>>> Cheers
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Best regards,
>>>>>> Jesper
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list
>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>
>>>>>
>>>>> --
>>>>> Loïc Dachary, Artisan Logiciel Libre

-- 
Loïc Dachary, Artisan Logiciel Libre
