How can I tell whether an OSD is super busy?

Thanks.

Wido den Hollander <wido@xxxxxxxx> wrote:
>
>
> On 2/2/20 5:20 PM, Andreas John wrote:
> > Hello,
> >
> > what you see is a stack trace, so the OSD is hitting an unexpected
> > state (otherwise there would be an error handler).
> >
> > The crash happens when the OSD wants to read from a pipe while
> > processing a heartbeat. To me it sounds like a networking issue.
> >
> > I see the other OSDs on that host are healthy, so I would bet there is
> > an issue with the TCP port that this particular OSD daemon used.
> >
>
> It could also be that this OSD is so busy internally with other stuff
> that it doesn't respond to heartbeats and then commits suicide.
>
> Combined with the comment that VMs can't read their data it could very
> well be that the OSD is super busy.
>
> Maybe try compacting the LevelDB database.
>
> Wido
>
> > Try 'netstat -tulpen' or so to check. Is there a firewall between the
> > hosts that might cut something off?
> >
> >
> > rgds,
> >
> > j.
> >
> >
> > On 02.02.20 04:08, Makito Nishimiya wrote:
> >> Hi.
> >>
> >> This is the cluster information.
> >>
> >>
> >> -------------- /var/log/ceph/ceph.osd.1.log ---------------------------
> >> 2020-02-01 03:47:20.635504 7f86f4e40700 1 heartbeat_map is_healthy
> >> 'OSD::osd_op_tp thread 0x7f86fe35e700' had timed out after 15
> >> 2020-02-01 03:47:20.635521 7f86f4f41700 1 heartbeat_map is_healthy
> >> 'OSD::osd_op_tp thread 0x7f86fe35e700' had timed out after 15
> >> 2020-02-01 03:47:20.747876 7f86f3b2d700 1 heartbeat_map is_healthy
> >> 'OSD::osd_op_tp thread 0x7f86fe35e700' had timed out after 15
> >> 2020-02-01 03:47:20.747903 7f86f3c2e700 1 heartbeat_map is_healthy
> >> 'OSD::osd_op_tp thread 0x7f86fe35e700' had timed out after 15
> >> 2020-02-01 03:47:21.152436 7f86e2a2e700 1 heartbeat_map is_healthy
> >> 'OSD::osd_op_tp thread 0x7f86fe35e700' had timed out after 15
> >> 2020-02-01 03:47:21.152441 7f86e2a2e700 1 heartbeat_map is_healthy
> >> 'OSD::osd_op_tp thread 0x7f86fe35e700' had suicide timed out after 150
> >> 2020-02-01 03:47:21.157963 7f86e2a2e700 -1 common/HeartbeatMap.cc: In
> >> function 'bool ceph::HeartbeatMap::_check(const
> >> ceph::heartbeat_handle_d*, const char*, time_t)' thread 7f86e2a2e700
> >> time 2020-02-01 03:47:21.152463
> >> common/HeartbeatMap.cc: 86: FAILED assert(0 == "hit suicide timeout")
> >>
> >> ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
> >> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> >> const*)+0x85) [0x7f875259f425]
> >> 2: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char
> >> const*, long)+0x2e1) [0x7f87524dede1]
> >> 3: (ceph::HeartbeatMap::is_healthy()+0xde) [0x7f87524df63e]
> >> 4: (OSD::handle_osd_ping(MOSDPing*)+0x93f) [0x7f8751f141df]
> >> 5: (OSD::heartbeat_dispatch(Message*)+0x3cb) [0x7f8751f1540b]
> >> 6: (DispatchQueue::fast_dispatch(Message*)+0x76) [0x7f875265f9d6]
> >> 7: (Pipe::reader()+0x1dff) [0x7f875269c68f]
> >> 8: (Pipe::Reader::entry()+0xd) [0x7f87526a41ad]
> >> 9: (()+0x7e25) [0x7f87502b8e25]
> >> 10: (clone()+0x6d) [0x7f874e94234d]
> >> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> >> needed to interpret this.
> >>
> >> --- begin dump of recent events ---
> >> -10000> 2020-02-01 03:45:29.491110 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.202.203:6805/5717 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f877512a200 con
> >> 0x7f8774eb6e80
> >> -9999> 2020-02-01 03:45:29.491109 7f86e8245700 1 --
> >> 10.1.201.201:0/3892596 <== osd.9 10.1.201.202:6811/3006646 12 ====
> >> osd_ping(ping_reply e126359 stamp 2020-02-01 03:45:29.490356) v2 ====
> >> 47+0+0 (4038678352 0 0) 0x7f8781383400 con 0x7f8774eaf480
> >> -9998> 2020-02-01 03:45:29.491123 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.201.203:6803/5220 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f877512bc00 con
> >> 0x7f8774eae100
> >> -9997> 2020-02-01 03:45:29.491137 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.202.203:6803/5220 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f877512aa00 con
> >> 0x7f8774eae280
> >> -9996> 2020-02-01 03:45:29.491150 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.201.203:6811/15624 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f877512ac00 con
> >> 0x7f8774eae700
> >> -9995> 2020-02-01 03:45:29.491148 7f86ec1a4700 1 --
> >> 10.1.201.201:0/3892596 <== osd.8 10.1.202.202:6812/2131331 12 ====
> >> osd_ping(ping_reply e126359 stamp 2020-02-01 03:45:29.490356) v2 ====
> >> 47+0+0 (4038678352 0 0) 0x7f8781380200 con 0x7f8774eb5200
> >> -9994> 2020-02-01 03:45:29.491163 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.202.203:6811/15624 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f877512be00 con
> >> 0x7f8774eae880
> >> -9993> 2020-02-01 03:45:29.491176 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.201.203:6801/4089 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f8775128a00 con
> >> 0x7f8774eaf780
> >> -9992> 2020-02-01 03:45:29.491169 7f86e8447700 1 --
> >> 10.1.201.201:0/3892596 <== osd.9 10.1.202.202:6811/3006646 12 ====
> >> osd_ping(ping_reply e126359 stamp 2020-02-01 03:45:29.490356) v2 ====
> >> 47+0+0 (4038678352 0 0) 0x7f8781380400 con 0x7f8774eaf600
> >> -9991> 2020-02-01 03:45:29.491167 7f86e975a700 1 --
> >> 10.1.201.201:0/3892596 <== osd.10 10.1.201.202:6801/3005449 12 ====
> >> osd_ping(ping_reply e126359 stamp 2020-02-01 03:45:29.490356) v2 ====
> >> 47+0+0 (4038678352 0 0) 0x7f8781380000 con 0x7f8774eb7c00
> >> -9990> 2020-02-01 03:45:29.491192 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.202.203:6801/4089 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f8775128c00 con
> >> 0x7f8774eaf900
> >> -9989> 2020-02-01 03:45:29.491206 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.201.203:6813/17285 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f8775451e00 con
> >> 0x7f8774ee3180
> >> -9988> 2020-02-01 03:45:29.491184 7f86e9255700 1 --
> >> 10.1.201.201:0/3892596 <== osd.11 10.1.201.202:6809/2004102 12 ====
> >> osd_ping(ping_reply e126359 stamp 2020-02-01 03:45:29.490356) v2 ====
> >> 47+0+0 (4038678352 0 0) 0x7f8781385800 con 0x7f8774eade00
> >> -9987> 2020-02-01 03:45:29.491202 7f86e9558700 1 --
> >> 10.1.201.201:0/3892596 <== osd.10 10.1.202.202:6801/3005449 12 ====
> >> osd_ping(ping_reply e126359 stamp 2020-02-01 03:45:29.490356) v2 ====
> >> 47+0+0 (4038678352 0 0) 0x7f8781385400 con 0x7f8774eb7d80
> >> -9986> 2020-02-01 03:45:29.491233 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.202.203:6813/17285 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f877544fe00 con
> >> 0x7f8774ee3300
> >> -9985> 2020-02-01 03:45:29.491238 7f86e9053700 1 --
> >> 10.1.201.201:0/3892596 <== osd.11 10.1.202.202:6813/2004102 12 ====
> >> osd_ping(ping_reply e126359 stamp 2020-02-01 03:45:29.490356) v2 ====
> >> 47+0+0 (4038678352 0 0) 0x7f8781383800 con 0x7f8774eadf80
> >> -9984> 2020-02-01 03:45:29.491234 7f86e874a700 1 --
> >> 10.1.201.201:0/3892596 <== osd.12 10.1.201.202:6803/4953 12 ====
> >> osd_ping(ping_reply e126359 stamp 2020-02-01 03:45:29.490356) v2 ====
> >> 47+0+0 (4038678352 0 0) 0x7f8781383a00 con 0x7f8774eae400
> >> -9983> 2020-02-01 03:45:29.491247 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.201.204:6807/13468 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f8775576000 con
> >> 0x7f8774eb5e00
> >> -9982> 2020-02-01 03:45:29.491258 7f86ecbae700 1 --
> >> 10.1.201.201:0/3892596 <== osd.13 10.1.201.202:6805/2390794 12 ====
> >> osd_ping(ping_reply e126359 stamp 2020-02-01 03:45:29.490356) v2 ====
> >> 47+0+0 (4038678352 0 0) 0x7f87813a6400 con 0x7f8774ee1980
> >> -9981> 2020-02-01 03:45:29.491266 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.202.204:6807/13468 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f8775576e00 con
> >> 0x7f8774eb5f80
> >> -9980> 2020-02-01 03:45:29.491291 7f86f9253700 1 --
> >> 10.1.201.201:0/3892596 --> 10.1.201.204:6809/410784 -- osd_ping(ping
> >> e126357 stamp 2020-02-01 03:45:29.490356) v2 -- ?+0 0x7f8774ffe400 con
> >> 0x7f8774eac900
> >>
> >> ---------------- ceph -s ------------------------------------------------------
> >>
> >>     cluster 57d2e172-0306-4f8a-946c-a4ae1e95b26b
> >>      health HEALTH_WARN
> >>             7 pgs backfill_wait
> >>             11 pgs backfilling
> >>             9 pgs degraded
> >>             4 pgs recovery_wait
> >>             21 pgs stuck unclean
> >>             6 pgs undersized
> >>             100 requests are blocked > 32 sec
> >>             recovery 649692/75245727 objects degraded (0.863%)
> >>             recovery 1027694/75245727 objects misplaced (1.366%)
> >>             recovery 1/24998284 unfound (0.000%)
> >>             pool default.rgw.buckets.data has many more objects per pg
> >>             than average (too few pgs?)
> >>      monmap e1: 3 mons at
> >>             {cephmon01=10.1.202.199:6789/0,cephmon02=10.1.202.198:6789/0,cephmon03=10.1.202.197:6789/0}
> >>             election epoch 562, quorum 0,1,2 cephmon03,cephmon02,cephmon01
> >>      osdmap e127248: 42 osds: 37 up, 37 in; 20 remapped pgs
> >>             flags sortbitwise,require_jewel_osds
> >>       pgmap v92257695: 1568 pgs, 14 pools, 18271 GB data, 24412 kobjects
> >>             55189 GB used, 45363 GB / 100553 GB avail
> >>             649692/75245727 objects degraded (0.863%)
> >>             1027694/75245727 objects misplaced (1.366%)
> >>             1/24998284 unfound (0.000%)
> >>                 1540 active+clean
> >>                    8 active+remapped+backfilling
> >>                    5 active+remapped+wait_backfill
> >>                    3 active+clean+scrubbing
> >>                    3 active+clean+scrubbing+deep
> >>                    3 active+undersized+degraded+remapped+backfilling
> >>                    2 active+undersized+degraded+remapped+wait_backfill
> >>                    2 active+recovery_wait+degraded
> >>                    1 active+recovery_wait+undersized+degraded+remapped
> >>                    1 active+recovery_wait+degraded+remapped
> >>   recovery io 239 MB/s, 187 objects/s
> >>     client io 575 kB/s wr, 0 op/s rd, 37 op/s wr
> >>
> >> ---------------- ceph osd tree ------------------------------------------------------
> >>
> >> [root@ceph01 ceph]# ceph osd tree
> >> ID WEIGHT    TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
> >> -1 108.19864 root default
> >> -2  19.09381     host ceph01
> >>  0   2.72769         osd.0        up  1.00000          1.00000
> >>  1   2.72769         osd.1      down        0          1.00000  <-- now down
> >>  2   2.72769         osd.2        up  1.00000          1.00000
> >>  5   2.72769         osd.5        up  1.00000          1.00000
> >>  6   2.72768         osd.6        up  1.00000          1.00000
> >>  3   2.72768         osd.3        up  1.00000          1.00000
> >>  4   2.72769         osd.4        up  1.00000          1.00000
> >> -3  19.09383     host ceph02
> >>  8   2.72769         osd.8        up  1.00000          1.00000
> >>  9   2.72769         osd.9        up  1.00000          1.00000
> >> 10   2.72769         osd.10       up  1.00000          1.00000
> >> 12   2.72769         osd.12       up  1.00000          1.00000
> >> 11   2.72769         osd.11       up  1.00000          1.00000
> >>  7   2.72768         osd.7        up  1.00000          1.00000
> >> 13   2.72769         osd.13       up  1.00000          1.00000
> >> -4  16.36626     host ceph03
> >> 14   2.72769         osd.14       up  1.00000          1.00000
> >> 16   2.72769         osd.16       up  1.00000          1.00000
> >> 17   2.72769         osd.17       up  1.00000          1.00000
> >> 19   2.72769         osd.19       up  1.00000          1.00000
> >> 15   1.81850         osd.15       up  1.00000          1.00000
> >> 18   1.81850         osd.18       up  1.00000          1.00000
> >> 20   1.81850         osd.20       up  1.00000          1.00000
> >> -5  15.45706     host ceph04
> >> 23   2.72769         osd.23       up  1.00000          1.00000
> >> 24   2.72769         osd.24       up  1.00000          1.00000
> >> 27   2.72769         osd.27     down        0          1.00000  <-- more than 3 months ago
> >> 21   1.81850         osd.21       up  1.00000          1.00000
> >> 22   1.81850         osd.22       up  1.00000          1.00000
> >> 25   1.81850         osd.25       up  1.00000          1.00000
> >> 26   1.81850         osd.26       up  1.00000          1.00000
> >> -6  19.09384     host ceph05
> >> 28   2.72769         osd.28       up  1.00000          1.00000
> >> 29   2.72769         osd.29       up  1.00000          1.00000
> >> 30   2.72769         osd.30       up  1.00000          1.00000
> >> 31   2.72769         osd.31     down        0          1.00000  <-- more than 3 months ago
> >> 32   2.72769         osd.32       up  1.00000          1.00000
> >> 34   2.72769         osd.34       up  1.00000          1.00000
> >> 33   2.72769         osd.33     down        0          1.00000  <-- more than 3 months ago
> >> -7  19.09384     host ceph06
> >> 35   2.72769         osd.35       up  1.00000          1.00000
> >> 36   2.72769         osd.36       up  1.00000          1.00000
> >> 37   2.72769         osd.37       up  1.00000          1.00000
> >> 39   2.72769         osd.39       up  1.00000          1.00000
> >> 40   2.72769         osd.40       up  1.00000          1.00000
> >> 41   2.72769         osd.41       up  1.00000          1.00000
> >> 38   2.72769         osd.38     down        0          1.00000  <-- more than 3 months ago
> >>
> >>
> >> ------------------------------
> >>
> >> On 2020/02/02 11:20, 西宮 牧人 wrote:
> >>> Servers: 6 (7 OSDs each), 42 OSDs in total
> >>> OS: CentOS 7
> >>> Ceph: 10.2.5
> >>>
> >>> Hi, everyone
> >>>
> >>> The cluster is used for VM image storage and object storage.
> >>> I also have a bucket which holds more than 20 million objects.
> >>>
> >>> Now I have a problem: the cluster blocks operations.
> >>>
> >>> Suddenly the cluster blocked operations, and then the VMs couldn't
> >>> read their disks.
> >>> After a few hours, osd.1 went down.
> >>>
> >>> There are no disk failure messages in dmesg,
> >>> and smartctl -a /dev/sde reports no errors.
> >>>
> >>> I tried to bring osd.1 back up, but it went down again soon afterwards.
> >>> Just after restarting osd.1, the VMs could access their disks.
> >>> But osd.1 always uses 100% CPU; then the cluster marked osd.1 down and
> >>> the OSD died from the suicide timeout.
> >>>
> >>> I found that the osdmap epoch of osd.1 is different from the others',
> >>> so I think osd.1 was dead.
> >>>
> >>>
> >>> Questions:
> >>> (1) Why does the epoch of osd.1 differ from those of the other OSDs?
> >>>
> >>> I checked oldest_map and newest_map on all OSDs with
> >>> "ceph daemon osd.X status".
> >>> All OSDs report the same epoch number except osd.1.
> >>>
> >>> (2) Why does osd.1 use 100% CPU?
> >>>
> >>> After the cluster marked osd.1 down, osd.1 stays busy.
> >>> When I execute "ceph tell osd.1 injectargs --debug-ms 5/1", osd.1
> >>> doesn't answer.
> >>>
> >>>
> >>> Thank you.
> >>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
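
Regarding the question at the top of the thread (how to tell that an OSD is busy): one rough signal is already visible in the quoted OSD log, namely how often the heartbeat map reports that an osd_op_tp worker thread "had timed out". The following is only a minimal sketch in Python for counting those messages. The regular expression is modelled on the log lines quoted in this thread (Ceph 10.2.5) and may not match other releases; the function name summarize_heartbeat_timeouts and the sample list are hypothetical and not part of any Ceph tool.

```python
import re
from collections import Counter

# Pattern modelled on the heartbeat_map lines quoted in this thread
# (Ceph 10.2.5). Assumption: other releases may word these lines differently.
HEARTBEAT_RE = re.compile(
    r"heartbeat_map is_healthy '(?P<pool>\S+) thread (?P<thread>0x[0-9a-f]+)' "
    r"had (?P<kind>suicide timed out|timed out) after (?P<secs>\d+)"
)


def summarize_heartbeat_timeouts(lines):
    """Count ordinary and suicide timeouts per (thread pool, thread id).

    Hypothetical helper for illustration only.
    """
    timeouts = Counter()
    suicides = Counter()
    for line in lines:
        match = HEARTBEAT_RE.search(line)
        if match is None:
            continue  # not a heartbeat_map timeout line
        key = (match.group("pool"), match.group("thread"))
        if match.group("kind") == "suicide timed out":
            suicides[key] += 1
        else:
            timeouts[key] += 1
    return timeouts, suicides


# Sample input: three lines from the log excerpt in this thread,
# rejoined into the single-line form they have in the log file.
sample = [
    "2020-02-01 03:47:20.635504 7f86f4e40700  1 heartbeat_map is_healthy "
    "'OSD::osd_op_tp thread 0x7f86fe35e700' had timed out after 15",
    "2020-02-01 03:47:21.152436 7f86e2a2e700  1 heartbeat_map is_healthy "
    "'OSD::osd_op_tp thread 0x7f86fe35e700' had timed out after 15",
    "2020-02-01 03:47:21.152441 7f86e2a2e700  1 heartbeat_map is_healthy "
    "'OSD::osd_op_tp thread 0x7f86fe35e700' had suicide timed out after 150",
]

timeouts, suicides = summarize_heartbeat_timeouts(sample)
for (pool, thread), count in timeouts.items():
    print(f"{pool} thread {thread}: {count} timeouts, "
          f"{suicides[(pool, thread)]} suicide timeouts")
```

On a real host the same function could be fed an open log file instead of the sample list, for example the /var/log/ceph/ceph.osd.1.log shown in this thread. Many timeouts for the same osd_op_tp thread only suggest that the thread is stuck or overloaded; they do not say why, so this is a starting point rather than a diagnosis.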