Re: Centos 7 OSD silently fail to start

Robert LeBlanc <robert@xxxxxxxxxxxxx> · Wed, 25 Feb 2015 15:03:40 -0700

Step #6 in http://ceph.com/docs/master/install/manual-deployment/#long-form
only sets up the file structure for the OSD; it doesn't start the
long-running process.
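
To make the difference concrete, here is a rough dry-run sketch that only
prints the two commands. I'm assuming step 6 is the --mkfs --mkkey invocation
from the long-form docs (check the version you are following); the helper
names are made up for illustration, and 16 is just the OSD id from your error
message.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# init_cmd and start_cmd are hypothetical helper names, not part of Ceph.

# Long-form step 6 (assumed form): prepares the OSD data directory and key,
# then exits. Nothing keeps running afterwards.
init_cmd() {
    echo "ceph-osd -i $1 --mkfs --mkkey"
}

# Starting the daemon by hand: this is the long-running process.
start_cmd() {
    echo "ceph-osd -i $1"
}

init_cmd 16
start_cmd 16
```

The second command is the one I ran manually (see my earlier mail below); the
init script or ceph-disk would normally do that part for you.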

On Wed, Feb 25, 2015 at 2:59 PM, Kyle Hutson <kylehutson@xxxxxxx> wrote:
> But I already issued that command (back in step 6).
>
> The interesting part is that "ceph-disk activate" apparently does it
> correctly. Even after reboot, the services start as they should.
>
> On Wed, Feb 25, 2015 at 3:54 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx>
> wrote:
>>
>> I think that your problem lies with systemd (even though you are using
>> SysV syntax, systemd is really doing the work). Systemd does not like
>> multiple arguments and I think this is why it is failing. There is
>> supposed to be some work done to get systemd working ok, but I think
>> it has the limitation of only working with a cluster named 'ceph'
>> currently.
>>
>> What I did to get around the problem was to run the osd command manually:
>>
>> ceph-osd -i <osd#>
>>
>> Once I understood the under-the-hood stuff, I moved to ceph-disk and
>> now because of the GPT partition IDs, udev automatically starts up the
>> OSD process at boot/creation and moves to the appropriate CRUSH
>> location (configurable in ceph.conf
>> http://ceph.com/docs/master/rados/operations/crush-map/#crush-location,
>> an example: crush location = host=test rack=rack3 row=row8
>> datacenter=local region=na-west root=default). To restart an OSD
>> process, I just kill the PID for the OSD then issue ceph-disk activate
>> /dev/sdx1 to restart the OSD process. You probably could stop it with
>> systemctl since I believe udev creates a resource for it (I should
>> probably look into that now that this system will be going production
>> soon).
>>
>> On Wed, Feb 25, 2015 at 2:13 PM, Kyle Hutson <kylehutson@xxxxxxx> wrote:
>> > I'm having a similar issue.
>> >
>> > I'm following http://ceph.com/docs/master/install/manual-deployment/ to
>> > a T.
>> >
>> > I have OSDs on the same host deployed with the short-form and they work
>> > fine. I am trying to deploy some more via the long form (because I want
>> > them to appear in a different location in the crush map). Everything
>> > through step 10 (i.e. ceph osd crush add {id-or-name} {weight}
>> > [{bucket-type}={bucket-name} ...] ) works just fine. When I go to step 11
>> > (sudo /etc/init.d/ceph start osd.{osd-num}) I get:
>> > /etc/init.d/ceph: osd.16 not found (/etc/ceph/ceph.conf defines
>> > mon.hobbit01 osd.7 osd.15 osd.10 osd.9 osd.1 osd.14 osd.2 osd.3 osd.13
>> > osd.8 osd.12 osd.6 osd.11 osd.5 osd.4 osd.0 , /var/lib/ceph defines
>> > mon.hobbit01 osd.7 osd.15 osd.10 osd.9 osd.1 osd.14 osd.2 osd.3 osd.13
>> > osd.8 osd.12 osd.6 osd.11 osd.5 osd.4 osd.0)
>> >
>> >
>> >
>> > On Wed, Feb 25, 2015 at 11:55 AM, Travis Rhoden <trhoden@xxxxxxxxx>
>> > wrote:
>> >>
>> >> Also, did you successfully start your monitor(s), and define/create the
>> >> OSDs within the Ceph cluster itself?
>> >>
>> >> There are several steps to creating a Ceph cluster manually. I'm unsure
>> >> if you have done the steps to actually create and register the OSDs
>> >> with the cluster.
>> >>
>> >>  - Travis
>> >>
>> >> On Wed, Feb 25, 2015 at 9:49 AM, Leszek Master <keksior@xxxxxxxxx>
>> >> wrote:
>> >>>
>> >>> Check firewall rules and selinux. It sometimes is a pain in the ... :)
>> >>>
>> >>> On 25 Feb 2015 01:46, "Barclay Jameson" <almightybeeij@xxxxxxxxx>
>> >>> wrote:
>> >>>
>> >>>> I have tried to install ceph using ceph-deploy but sgdisk seems to
>> >>>> have too many issues so I did a manual install. After running
>> >>>> mkfs.btrfs on the disks and journals and mounting them, I tried to
>> >>>> start the OSDs, which failed. The first error was:
>> >>>> # /etc/init.d/ceph start osd.0
>> >>>> /etc/init.d/ceph: osd.0 not found (/etc/ceph/ceph.conf defines ,
>> >>>> /var/lib/ceph defines )
>> >>>>
>> >>>> I then manually added the osds to the conf file with the following as
>> >>>> an example:
>> >>>> [osd.0]
>> >>>>     osd_host = node01
>> >>>>
>> >>>> Now when I run the command :
>> >>>> # /etc/init.d/ceph start osd.0
>> >>>>
>> >>>> There is no error or output from the command, and in fact when I do a
>> >>>> ceph -s no OSDs are listed as being up.
>> >>>> Doing a ps aux | grep -i ceph or ps aux | grep -i osd shows there are
>> >>>> no OSDs running.
>> >>>> I have also run htop to see if any processes are running, and none are
>> >>>> shown.
>> >>>>
>> >>>> I had this working on SL6.5 with Firefly but Giant on Centos 7 has
>> >>>> been nothing but a giant pain.
>> >>>> _______________________________________________
>> >>>> ceph-users mailing list
>> >>>> ceph-users@xxxxxxxxxxxxxx
>> >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com