Re: ceph-disk and /dev/dm-* permissions - race condition?

That's an interesting workaround; I may end up using it if all else fails.

I watched the ownership of the /dev/dm-* devices during the boot
process. They start out correctly as "ceph:ceph", but at the end of
the ceph-disk preparation a "ceph-disk trigger" is executed, which
seems to reset the ownership to "root:disk".  The ceph-osd processes
that are already running can continue, but if they have to restart
for any reason, they will fail.
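
For anyone who wants to reproduce the observation, here is a rough
sketch of a polling script that logs every ownership change on the
/dev/dm-* nodes with a timestamp.  It is only an illustration: the
function names and the 0.5 second interval are my own choices, nothing
here comes from ceph-disk, and polling can miss a change that is undone
within one interval.

```python
import glob
import grp
import os
import pwd
import time


def owner_of(path):
    """Return 'user:group' for path, falling back to numeric ids."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return f"{user}:{group}"


def snapshot(pattern="/dev/dm-*"):
    """Map every path matching pattern to its current 'user:group'."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        try:
            result[path] = owner_of(path)
        except FileNotFoundError:
            pass  # device node vanished between glob and stat
    return result


def changes(old, new):
    """List (path, before, after) for each path whose ownership differs.

    A path that appeared or disappeared shows up with None on one side.
    """
    out = []
    for path in sorted(set(old) | set(new)):
        before, after = old.get(path), new.get(path)
        if before != after:
            out.append((path, before, after))
    return out


def watch(pattern="/dev/dm-*", interval=0.5):
    """Poll forever and print each ownership change as it is seen."""
    previous = snapshot(pattern)
    while True:
        time.sleep(interval)
        current = snapshot(pattern)
        stamp = time.strftime("%H:%M:%S")
        for path, before, after in changes(previous, current):
            print(f"{stamp} {path}: {before} -> {after}", flush=True)
        previous = current
```

Calling watch() early in boot (for example from a throwaway service
unit) and comparing its timestamps with the ceph-disk log entries
should show whether the flip to "root:disk" lines up with the trigger.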

It could be a problem with the udev rules for the encrypted data and
journal partitions.  Debugging udev is a nightmare.  I'm hoping someone
else has already solved this one.



On Sat, Nov 5, 2016 at 1:13 AM, Rajib Hossen
<rajib.hossen.ipvision@xxxxxxxxx> wrote:
> Hello,
> I had a similar issue and solved it with a cron job. In crontab -e:
> "@reboot chown -R ceph:ceph /dev/vdb1". Say my journal is on disk vdb,
> first partition (vdb1); vdb2 is my data disk.
>
> On Fri, Nov 4, 2016 at 8:51 PM, Wyllys Ingersoll
> <wyllys.ingersoll@xxxxxxxxxxxxxx> wrote:
>>
>> We are running 10.2.3 with encrypted OSDs and journals using the old
>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
>> after a reboot of a storage server.  Our data and journals are on
>> separate partitions on the same disk.
>>
>> After a reboot, sometimes the OSDs fail to start because of
>> permissions problems.  The /dev/dm-* devices come back with
>> permissions set to "root:disk" sometimes instead of "ceph:ceph".
>> Weirder still is that sometimes the ceph-osd will start and work in
>> spite of the incorrect permissions (root:disk) and other times they
>> will fail and the logs show permissions errors when trying to access
>> the journals. Sometimes half of the /dev/dm- devices are "root:disk"
>> and others are "ceph:ceph".  There's no clear pattern, so that's what
>> leads me to think it's a race condition in the ceph_disk "dmcrypt_map"
>> function.
>>
>> Is there a known issue with ceph-disk and/or ceph-osd related to the
>> timing of the encrypted devices being set up and the permissions
>> getting changed so the ceph processes can access them?
>>
>> Wyllys Ingersoll
>> Keeper Technology, LLC
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>