Re: Problem with Ceph daemons


 



Can you retry after resetting the systemd unit? The failed state behind "Start request repeated too quickly." has to be cleared first; then start it again:

systemctl reset-failed ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service
systemctl start ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service

Then check the logs again. If there's still nothing in the rgw log, you'll need to check the (active) mgr daemon's logs for anything suspicious, and also the syslog on that rgw host (a few example commands below). Is the rest of the cluster healthy? Are the rgw daemons colocated with other services?
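
Something along these lines, just as a sketch; the fsid, the active mgr name and the placeholder in angle brackets will differ in your cluster, and 'cephadm logs' has to be run on the host carrying that daemon:

ceph -s                                   # overall cluster health
ceph mgr stat                             # which mgr is currently active
cephadm logs --name mgr.<active-mgr>      # mgr daemon log on its host
journalctl -xe --no-pager | grep -i rgw   # syslog/journal on the rgw host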


Quoting Ron Gage <ron@xxxxxxxxxxx>:

Adam:



Not really….



-- Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has begun starting up.

Feb 16 15:01:03 c01 podman[426007]:

Feb 16 15:01:04 c01 bash[426007]: 915d1e19fa0f213902c666371c8e825480e103f85172f3b15d1d5bf2427a87c9

Feb 16 15:01:04 c01 conmon[426038]: debug 2022-02-16T20:01:04.303+0000 7f4f72ff6440 0 deferred set uid:gid to 167:167 (ceph:ceph)

Feb 16 15:01:04 c01 conmon[426038]: debug 2022-02-16T20:01:04.303+0000 7f4f72ff6440 0 ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (st>

Feb 16 15:01:04 c01 conmon[426038]: debug 2022-02-16T20:01:04.303+0000 7f4f72ff6440 0 framework: beast

Feb 16 15:01:04 c01 conmon[426038]: debug 2022-02-16T20:01:04.303+0000 7f4f72ff6440 0 framework conf key: port, val: 80

Feb 16 15:01:04 c01 conmon[426038]: debug 2022-02-16T20:01:04.303+0000 7f4f72ff6440 1 radosgw_Main not setting numa affinity

Feb 16 15:01:04 c01 systemd[1]: Started Ceph rgw.obj0.c01.gpqshk for 35194656-893e-11ec-85c8-005056870dae.

-- Subject: Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has finished start-up

-- Defined-By: systemd

-- Support: https://access.redhat.com/support

--

-- Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has finished starting up.

--

-- The start-up result is done.

Feb 16 15:01:04 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Main process exited, code=exited, status=98/n/a

Feb 16 15:01:05 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Failed with result 'exit-code'.

-- Subject: Unit failed

-- Defined-By: systemd

-- Support: https://access.redhat.com/support

--

-- The unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has entered the 'failed' state with result 'exit-code'.

Feb 16 15:01:15 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Service RestartSec=10s expired, scheduling restart.

Feb 16 15:01:15 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Scheduled restart job, restart counter is at 5.

-- Subject: Automatic restarting of a unit has been scheduled

-- Defined-By: systemd

-- Support: https://access.redhat.com/support

--

-- Automatic restarting of the unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has been scheduled, as the result for

-- the configured Restart= setting for the unit.

Feb 16 15:01:15 c01 systemd[1]: Stopped Ceph rgw.obj0.c01.gpqshk for 35194656-893e-11ec-85c8-005056870dae.

-- Subject: Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has finished shutting down

-- Defined-By: systemd

-- Support: https://access.redhat.com/support

--

-- Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has finished shutting down.

Feb 16 15:01:15 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Start request repeated too quickly.

Feb 16 15:01:15 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Failed with result 'exit-code'.

-- Subject: Unit failed

-- Defined-By: systemd

-- Support: https://access.redhat.com/support

--

-- The unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has entered the 'failed' state with result 'exit-code'.

Feb 16 15:01:15 c01 systemd[1]: Failed to start Ceph rgw.obj0.c01.gpqshk for 35194656-893e-11ec-85c8-005056870dae.

-- Subject: Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has failed

-- Defined-By: systemd

-- Support: https://access.redhat.com/support

--

-- Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has failed.

--

-- The result is failed.



Ron Gage

Westland, MI



From: Adam King <adking@xxxxxxxxxx>
Sent: Wednesday, February 16, 2022 4:18 PM
To: Ron Gage <ron@xxxxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxx>
Subject: Re:  Problem with Ceph daemons



Is there anything useful in the rgw daemon's logs? (e.g. journalctl -xeu ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk)



 - Adam King



On Wed, Feb 16, 2022 at 3:58 PM Ron Gage <ron@xxxxxxxxxxx> wrote:

Hi everyone!



Looks like I am having some problems with some of my ceph RGW daemons - they
won't stay running.



From 'cephadm ls':



{

        "style": "cephadm:v1",

        "name": "rgw.obj0.c01.gpqshk",

        "fsid": "35194656-893e-11ec-85c8-005056870dae",

        "systemd_unit":
"ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk
<mailto:ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk <mailto:ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk> > ",

        "enabled": true,

        "state": "error",

        "service_name": "rgw.obj0",

        "ports": [

            80

        ],

        "ip": null,

        "deployed_by": [


"quay.io/ceph/ceph@sha256:c3a89afac4f9c83c716af57e08863f7010318538c7e2cd9114 <http://quay.io/ceph/ceph@sha256:c3a89afac4f9c83c716af57e08863f7010318538c7e2cd911458800097f7d97d>
58800097f7d97d
<mailto:quay.io <mailto:quay.io> /ceph/ceph@sha256:c3a89afac4f9c83c716af57e08863f7010318538c7e
2cd911458800097f7d97d> ",


"quay.io/ceph/ceph@sha256:a39107f8d3daab4d756eabd6ee1630d1bc7f31eaa76fff41a7 <http://quay.io/ceph/ceph@sha256:a39107f8d3daab4d756eabd6ee1630d1bc7f31eaa76fff41a77fa32d0b903061>
7fa32d0b903061
<mailto:quay.io <mailto:quay.io> /ceph/ceph@sha256:a39107f8d3daab4d756eabd6ee1630d1bc7f31eaa76
fff41a77fa32d0b903061> "

        ],

        "rank": null,

        "rank_generation": null,

        "memory_request": null,

        "memory_limit": null,

        "container_id": null,

        "container_image_name":
"quay.io/ceph/ceph@sha256:a39107f8d3daab4d756eabd6ee1630d1bc7f31eaa76fff41a7 <http://quay.io/ceph/ceph@sha256:a39107f8d3daab4d756eabd6ee1630d1bc7f31eaa76fff41a77fa32d0b903061>
7fa32d0b903061
<mailto:quay.io <mailto:quay.io> /ceph/ceph@sha256:a39107f8d3daab4d756eabd6ee1630d1bc7f31eaa76
fff41a77fa32d0b903061> ",

        "container_image_id": null,

        "container_image_digests": null,

        "version": null,

        "started": null,

        "created": "2022-02-09T01:00:53.411541Z",

        "deployed": "2022-02-09T01:00:52.338515Z",

        "configured": "2022-02-09T01:00:53.411541Z"

    },



That whole "state": "error" bit is concerning to me - and it is contributing to the cluster's warning status (showing 6 cephadm daemons down).



Can I get a hint or two on how to fix this?


Thanks!



Ron Gage

Westland, MI







_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



