radosgw stopped working

Hi,

For some reason, radosgw has stopped working.

Cluster status:
[root@ctplmon1 ~]# ceph -v
ceph version 17.2.8 (f817ceb7f187defb1d021d6328fa833eb8e943b3) quincy (stable)
[root@ctplmon1 ~]# ceph -s
  cluster:
    id:     0a6e5422-ac75-4093-af20-528ee00cc847
    health: HEALTH_ERR
            6 OSD(s) experiencing slow operations in BlueStore
            2 backfillfull osd(s)
            1 full osd(s)
            1 nearfull osd(s)
            Low space hindering backfill (add storage if this doesn't resolve itself): 32 pgs backfill_toofull
            Degraded data redundancy: 835306/1285383707 objects degraded (0.065%), 6 pgs degraded, 5 pgs undersized
            76 pgs not deep-scrubbed in time
            45 pgs not scrubbed in time
            Full OSDs blocking recovery: 1 pg recovery_toofull
            9 pool(s) full
            9 daemons have recently crashed

  services:
    mon: 3 daemons, quorum ctplmon1,ctplmon3,ctplmon2 (age 36m)
    mgr: ctplmon1(active, since 65m)
    mds: 1/1 daemons up
    osd: 193 osds: 191 up (since 8m), 191 in (since 9m); 267 remapped pgs
    rgw: 2 daemons active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   10 pools, 793 pgs
    objects: 257.08M objects, 292 TiB
    usage:   614 TiB used, 386 TiB / 1000 TiB avail
    pgs:     835306/1285383707 objects degraded (0.065%)
             225512620/1285383707 objects misplaced (17.544%)
             525 active+clean
             230 active+remapped+backfilling
             32  active+remapped+backfill_toofull
             5   active+undersized+degraded+remapped+backfilling
             1   active+recovery_toofull+degraded

  io:
    recovery: 978 MiB/s, 825 objects/s

---

I do not know whether it is related, but the cluster has been rebalancing for
a few days now, since I restricted the EC pool to HDDs only.

---

If I start rgw in the foreground with debug logging, it exits after an
initialization timeout:
[root@ctplmon2 ~]# radosgw -c /etc/ceph/ceph.conf --setuser ceph --setgroup ceph -n client.radosgw.moja.shramba.ctplmon2 -f -m 194.249.4.104:6789 --debug-rgw=99/99
2024-12-21T23:21:59.898+0100 7f659e380640 -1 Initialization timeout, failed to initialize

The log shows:
2024-12-21T23:16:59.898+0100 7f65a19257c0  0 deferred set uid:gid to 167:167 (ceph:ceph)
2024-12-21T23:16:59.898+0100 7f65a19257c0  0 ceph version 17.2.8 (f817ceb7f187defb1d021d6328fa833eb8e943b3) quincy (stable), process radosgw, pid 168935
2024-12-21T23:16:59.898+0100 7f65a19257c0  0 framework: beast
2024-12-21T23:16:59.898+0100 7f65a19257c0  0 framework conf key: port, val: 4444
2024-12-21T23:16:59.898+0100 7f65a19257c0  1 radosgw_Main not setting numa affinity
2024-12-21T23:16:59.901+0100 7f65a19257c0  1 rgw_d3n: rgw_d3n_l1_local_datacache_enabled=0
2024-12-21T23:16:59.901+0100 7f65a19257c0  1 D3N datacache enabled: 0
2024-12-21T23:16:59.901+0100 7f658dffb640 20 reqs_thread_entry: start
2024-12-21T23:16:59.901+0100 7f658d7fa640 10 entry start
2024-12-21T23:16:59.908+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:16:59.914+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=-2 bl.length=0
2024-12-21T23:16:59.914+0100 7f65a19257c0 20 rgw main: realm
2024-12-21T23:16:59.914+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:16:59.915+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=-2 bl.length=0
2024-12-21T23:16:59.915+0100 7f65a19257c0  4 rgw main: RGWPeriod::init failed to init realm  id  : (2) No such file or directory
2024-12-21T23:16:59.915+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:16:59.915+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=-2 bl.length=0
2024-12-21T23:16:59.915+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:16:59.917+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=0 bl.length=46
2024-12-21T23:16:59.917+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:16:59.945+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=0 bl.length=873
2024-12-21T23:16:59.945+0100 7f65a19257c0 20 rgw main: searching for the correct realm
2024-12-21T23:17:00.210+0100 7f65a19257c0 20 rgw main: RGWRados::pool_iterate: got zone_info.c2c70444-7a41-4acd-a0d0-9f87d324ec72
2024-12-21T23:17:00.210+0100 7f65a19257c0 20 rgw main: RGWRados::pool_iterate: got zonegroup_info.b1e0d55c-f7cb-4e73-b1cb-6cffa1fd6578
2024-12-21T23:17:00.210+0100 7f65a19257c0 20 rgw main: RGWRados::pool_iterate: got zone_names.default
2024-12-21T23:17:00.210+0100 7f65a19257c0 20 rgw main: RGWRados::pool_iterate: got zonegroups_names.default
2024-12-21T23:17:00.210+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.210+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=-2 bl.length=0
2024-12-21T23:17:00.210+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.211+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=0 bl.length=46
2024-12-21T23:17:00.211+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.212+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=0 bl.length=358
2024-12-21T23:17:00.212+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.213+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=-2 bl.length=0
2024-12-21T23:17:00.213+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.214+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=-2 bl.length=0
2024-12-21T23:17:00.214+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.215+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=0 bl.length=46
2024-12-21T23:17:00.284+0100 7f65a19257c0 20 rgw main: RGWRados::pool_iterate: got zone_info.c2c70444-7a41-4acd-a0d0-9f87d324ec72
2024-12-21T23:17:00.284+0100 7f65a19257c0 20 rgw main: RGWRados::pool_iterate: got zonegroup_info.b1e0d55c-f7cb-4e73-b1cb-6cffa1fd6578
2024-12-21T23:17:00.284+0100 7f65a19257c0 20 rgw main: RGWRados::pool_iterate: got zone_names.default
2024-12-21T23:17:00.284+0100 7f65a19257c0 20 rgw main: RGWRados::pool_iterate: got zonegroups_names.default
2024-12-21T23:17:00.284+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.285+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=0 bl.length=46
2024-12-21T23:17:00.285+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.286+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=0 bl.length=873
2024-12-21T23:17:00.286+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.287+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=0 bl.length=46
2024-12-21T23:17:00.287+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.293+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=0 bl.length=358
2024-12-21T23:17:00.293+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.295+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=-2 bl.length=0
2024-12-21T23:17:00.295+0100 7f65a19257c0 20 zone default found
2024-12-21T23:17:00.295+0100 7f65a19257c0  4 rgw main: Realm:            ()
2024-12-21T23:17:00.295+0100 7f65a19257c0  4 rgw main: ZoneGroup: default            (b1e0d55c-f7cb-4e73-b1cb-6cffa1fd6578)
2024-12-21T23:17:00.295+0100 7f65a19257c0  4 rgw main: Zone:      default            (c2c70444-7a41-4acd-a0d0-9f87d324ec72)
2024-12-21T23:17:00.295+0100 7f65a19257c0 10 cannot find current period zonegroup using local zonegroup configuration
2024-12-21T23:17:00.295+0100 7f65a19257c0 20 rgw main: zonegroup default
2024-12-21T23:17:00.295+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.296+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=-2 bl.length=0
2024-12-21T23:17:00.296+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.299+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=-2 bl.length=0
2024-12-21T23:17:00.299+0100 7f65a19257c0 20 rgw main: rados->read ofs=0 len=0
2024-12-21T23:17:00.303+0100 7f65a19257c0 20 rgw main: rados_obj.operate() r=-2 bl.length=0
2024-12-21T23:17:00.303+0100 7f65a19257c0 20 rgw main: started sync module instance, tier type =
2024-12-21T23:17:00.303+0100 7f65a19257c0 20 rgw main: started zone id=c2c70444-7a41-4acd-a0d0-9f87d324ec72 (name=default) with tier type =
2024-12-21T23:21:59.898+0100 7f659e380640 -1 Initialization timeout, failed to initialize
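
The gap between the first log line and the timeout can be checked with a small Python sketch; the two timestamps are copied from the log above and the variable names are only illustrative. It works out to exactly 300 seconds, which would be consistent with the `rgw_init_timeout` option, assuming it is at its default of 300 seconds here. After the "started zone" line at 23:17:00.303 nothing more is logged until the timeout fires.

```python
from datetime import datetime

# Timestamp format used in the radosgw log lines above.
LOG_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

# First log line of the run and the "Initialization timeout" line.
start = datetime.strptime("2024-12-21T23:16:59.898+0100", LOG_TIME_FORMAT)
timeout = datetime.strptime("2024-12-21T23:21:59.898+0100", LOG_TIME_FORMAT)

elapsed = timeout - start
print(f"elapsed before timeout: {elapsed.total_seconds():.0f} s")
```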

---

Any ideas what might have caused rgw to stop working?

Kind regards,
Rok