I don't know, but making the change in the 55-dm.rules file seems to do
the trick well enough for now.

On Tue, Nov 22, 2016 at 12:07 PM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
>
>
> On 22/11/2016 16:13, Wyllys Ingersoll wrote:
>> I think that sounds reasonable, obviously more testing will be needed
>> to verify. Our situation occurred on an Ubuntu Trusty (upstart based,
>> not systemd) server, so I don't think this will help for non-systemd
>> systems.
>
> I don't think there is a way to enforce an order with upstart. But
> maybe there is? If you don't know about it I will research.
>
>> On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
>>> Hi,
>>>
>>> It should be enough to add After=local-fs.target to
>>> /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger
>>> --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since
>>> local-fs.target indirectly depends on dm, this ensures ceph disk
>>> activation will only happen after dm is finished. It is entirely
>>> possible that the ownership is incorrect when ceph-disk trigger --sync
>>> starts running, but it will no longer race with dm and it can safely
>>> chown ceph:ceph and proceed with activation.
>>>
>>> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm
>>> not sure yet if I'm missing something or if that's the right thing to
>>> do.
>>>
>>> What do you think?
>>>
>>> On 04/11/2016 15:51, Wyllys Ingersoll wrote:
>>>> We are running 10.2.3 with encrypted OSDs and journals using the old
>>>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
>>>> after a reboot of a storage server. Our data and journals are on
>>>> separate partitions on the same disk.
>>>>
>>>> After a reboot, sometimes the OSDs fail to start because of
>>>> permissions problems. The /dev/dm-* devices come back with
>>>> permissions set to "root:disk" sometimes instead of "ceph:ceph".
>>>> Weirder still is that sometimes the ceph-osd will start and work in
>>>> spite of the incorrect permissions (root:disk) and other times they
>>>> will fail and the logs show permissions errors when trying to access
>>>> the journals. Sometimes half of the /dev/dm-* devices are "root:disk"
>>>> and others are "ceph:ceph". There's no clear pattern, so that's what
>>>> leads me to think it's a race condition in the ceph_disk "dmcrypt_map"
>>>> function.
>>>>
>>>> Is there a known issue with ceph-disk and/or ceph-osd related to
>>>> timing of the encrypted devices being set up and the permissions
>>>> getting changed so the ceph processes can access them?
>>>>
>>>> Wyllys Ingersoll
>>>> Keeper Technology, LLC
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>> --
>>> Loïc Dachary, Artisan Logiciel Libre
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
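
The symptom described in this thread (some /dev/dm-* nodes coming back as
root:disk instead of ceph:ceph after a reboot) can be checked for with a
small read-only script. This is only a sketch and not part of ceph-disk:
the helper names owner_of() and find_misowned() are made up for
illustration, and the /dev/dm-* pattern and the ceph:ceph owner are taken
from the messages above. It reports mismatches and never changes ownership.

```python
#!/usr/bin/env python3
"""Report device nodes whose owner or group differs from the expected one.

Illustrative only: owner_of() and find_misowned() are hypothetical helpers,
not part of ceph-disk. The script only reads ownership.
"""
import glob
import grp
import os
import pwd


def owner_of(path):
    """Return (user, group) names for path, falling back to numeric ids."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return user, group


def find_misowned(paths, user="ceph", group="ceph"):
    """Return [(path, actual_user, actual_group)] for every mismatch."""
    bad = []
    for path in sorted(paths):
        actual = owner_of(path)
        if actual != (user, group):
            bad.append((path,) + actual)
    return bad


if __name__ == "__main__":
    # /dev/dm-* is the device pattern mentioned in the thread.
    for path, user, group in find_misowned(glob.glob("/dev/dm-*")):
        print("%s is %s:%s, expected ceph:ceph" % (path, user, group))
```

Running it a few times right after boot, before and after the OSDs
activate, should show whether the ownership is still changing, which is
what a race with dm would look like.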