Thanks, all, for the responses. It seems both original responders relied on a modification to the osd-prestart script. I can confirm that works for me too and am using it as a temporary solution.
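For anyone searching later, this is roughly the shape of that workaround, as a minimal sketch appended near the end of /usr/lib/ceph/ceph-osd-prestart.sh (the script path and the $cluster/$id variables it parses from its --cluster/--id arguments are assumptions about your packaging; adjust to match your environment):

# Workaround sketch: make sure the filestore journal partition is owned by
# ceph:ceph before ceph-osd starts. Assumes $cluster and $id were set earlier
# in the prestart script from the --cluster/--id arguments.
journal="/var/lib/ceph/osd/${cluster:-ceph}-${id}/journal"
if [ -L "$journal" ]; then
    target="$(readlink -f "$journal")"
    [ -b "$target" ] && chown ceph:ceph "$target"
fi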
Respectfully,
Wes Dillingham
Jan, it seems the log you mentioned shows the unit attempting to do the right thing here (chown the journal partition to ceph:ceph), but the change does not seem to take. I will continue to look and will file a bug/PR if I make any progress. Here are the logs related to an affected OSD (219) and its journal partition (/dev/sdc5); a quick ownership check is sketched after the log. As a note, this log output is from a server which already has the added chown in the osd prestart script.
[2020-01-23 13:03:27,230][systemd][INFO ] raw systemd input received: lvm-219-529ea347-b129-4b53-81cb-bb5f2d91f8ae
[2020-01-23 13:03:27,231][systemd][INFO ] parsed sub-command: lvm, extra data: 219-529ea347-b129-4b53-81cb-bb5f2d91f8ae
[2020-01-23 13:03:27,305][ceph_volume.process][INFO ] Running command: /usr/sbin/ceph-volume lvm trigger 219-529ea347-b129-4b53-81cb-bb5f2d91f8ae
[2020-01-23 13:03:28,235][ceph_volume.process][INFO ] stderr Running command: /bin/mount -t xfs -o rw,noatime,inode64 /dev/ceph-dd8d5283-fd1a-4114-9b53-6478dab7101c/osd-data-529ea347-b129-4b53-81cb-bb5f2d91f8ae /var/lib/ceph/osd/ceph-219
[2020-01-23 13:03:28,472][ceph_volume.process][INFO ] stderr Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-219
[2020-01-23 13:27:19,566][ceph_volume.process][INFO ] stderr Running command: /bin/ln -snf /dev/sdc5 /var/lib/ceph/osd/ceph-219/journal
[2020-01-23 13:27:19,571][ceph_volume.process][INFO ] stderr Running command: /bin/chown -R ceph:ceph /dev/sdc5
[2020-01-23 13:27:19,576][ceph_volume.process][INFO ] stderr Running command: /bin/systemctl enable ceph-volume@lvm-219-529ea347-b129-4b53-81cb-bb5f2d91f8ae
[2020-01-23 13:27:19,672][ceph_volume.process][INFO ] stderr Running command: /bin/systemctl enable --runtime ceph-osd@219
[2020-01-23 13:27:19,679][ceph_volume.process][INFO ] stderr stderr: Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd@219.service to /usr/lib/systemd/system/ceph-osd@.service.
[2020-01-23 13:27:19,765][ceph_volume.process][INFO ] stderr Running command: /bin/systemctl start ceph-osd@219
[2020-01-23 13:27:19,773][ceph_volume.process][INFO ] stderr --> ceph-volume lvm activate successful for osd ID: 219
[2020-01-23 13:27:19,784][systemd][INFO ] successfully triggered activation for: 219-529ea347-b129-4b53-81cb-bb5f2d91f8ae
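For completeness, a quick way to confirm whether the chown actually took on the underlying device (plain shell, nothing ceph-specific; OSD 219 just as the example from above):

# Resolve the journal symlink and show the owner of the partition it points to.
readlink -f /var/lib/ceph/osd/ceph-219/journal
stat -c '%U:%G %n' "$(readlink -f /var/lib/ceph/osd/ceph-219/journal)"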
Respectfully,
Wes Dillingham
On Thu, Jan 23, 2020 at 4:31 AM Jan Fajerski <jfajerski@xxxxxxxx> wrote:
On Wed, Jan 22, 2020 at 12:00:28PM -0500, Wesley Dillingham wrote:
> After upgrading to Nautilus 14.2.6 from Luminous 12.2.12 we are seeing
> the following behavior on OSDs which were created with "ceph-volume lvm
> create --filestore --osd-id <osd> --data <device> --journal <journal>"
> Upon restart of the server containing these OSDs they fail to start
> with the following error in the logs:
>2020-01-21 13:36:11.635 7fee633e8a80 -1 filestore(/var/lib/ceph/osd/ceph-199) mount(1928): failed to open journal /var/lib/ceph/osd/ceph-199/journal: (13) Permission denied
>
> /var/lib/ceph/osd/ceph-199/journal symlinks to /dev/sdc5 in our case,
> and inspecting the ownership on /dev/sdc5 it is root:root. Chowning
> that to ceph:ceph causes the OSD to start and come back up and in
> almost instantly.
> As a note, the OSDs we experience this with are OSDs which have
> previously failed and been replaced using the above ceph-volume
> command; longer-running OSDs in the same server created with ceph-disk
> or ceph-volume simple (that have a corresponding .json in
> /etc/ceph/osd) start up fine and get ceph:ceph on their journal
> partition. Bluestore OSDs also do not have any issue.
> My hope is that I can preemptively fix these OSDs before shutting them
> down so that reboots happen seamlessly. Thanks for any insight.
ceph-volume is supposed to take care of this via the ceph-volume@ systemd unit.
This is a one-shot unit that should set things up and then start the OSD.
The unit name is a bit convoluted: ceph-volume@<osd-id>-<osd-uuid>; there should
be a symbolic link in /etc/systemd/system/multi-user.target.wants/.
You can also check /var/log/ceph/ceph-volume-systemd.log for any errors.
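For example (just a sketch; substitute your own osd id/uuid where needed):

# Is the one-shot activation unit enabled and present?
ls -l /etc/systemd/system/multi-user.target.wants/ | grep ceph-volume
systemctl list-units 'ceph-volume@*'
# Anything suspicious in the ceph-volume systemd log?
grep -i -e error -e fail /var/log/ceph/ceph-volume-systemd.log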
Feel free to open a tracker ticket on
https://tracker.ceph.com/projects/ceph-volume
>
> Respectfully,
> Wes Dillingham
> wes@xxxxxxxxxxxxxxxxx
> http://www.linkedin.com/in/wesleydillingham
>_______________________________________________
>ceph-users mailing list -- ceph-users@xxxxxxx
>To unsubscribe send an email to ceph-users-leave@xxxxxxx
--
Jan Fajerski
Senior Software Engineer Enterprise Storage
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx