Dear Mr Kefu Chai,

Sorry to disturb you. I ran into a problem recently. In my Ceph cluster, the health status has shown the warning “store is getting too big!” for several days, and ceph-mon is using nearly 100% of a CPU. Have you ever seen this situation?

Some detailed information is attached below:

root@cvknode17:~# ceph -s
    cluster 04afba60-3a77-496c-b616-2ecb5e47e141
     health HEALTH_WARN
            mon.cvknode17 store is getting too big! 34104 MB >= 15360 MB
     monmap e1: 3 mons at {cvknode15=172.16.51.15:6789/0,cvknode16=172.16.51.16:6789/0,cvknode17=172.16.51.17:6789/0}
            election epoch 862, quorum 0,1,2 cvknode15,cvknode16,cvknode17
     osdmap e196279: 347 osds: 347 up, 347 in
      pgmap v5891025: 33272 pgs, 16 pools, 26944 GB data, 6822 kobjects
            65966 GB used, 579 TB / 644 TB avail
               33270 active+clean
                   2 active+clean+scrubbing+deep
  client io 840 kB/s rd, 739 kB/s wr, 35 op/s rd, 184 op/s wr

root@cvknode17:~# top
top - 15:19:28 up 23 days, 23:58, 6 users, load average: 1.08, 1.40, 1.77
Tasks: 346 total, 2 running, 342 sleeping, 0 stopped, 2 zombie
Cpu(s): 8.1%us, 10.8%sy, 0.0%ni, 69.0%id, 9.5%wa, 0.0%hi, 2.5%si, 0.0%st
Mem: 65384424k total, 58102880k used, 7281544k free, 240720k buffers
Swap: 29999100k total, 344944k used, 29654156k free, 24274272k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24407 root 20 0 17.3g 12g 10m S 98 20.2 8420:11 ceph-mon

root@cvknode17:~# top -Hp 24407
top - 15:19:49 up 23 days, 23:59, 6 users, load average: 1.12, 1.39, 1.76
Tasks: 17 total, 1 running, 16 sleeping, 0 stopped, 0 zombie
Cpu(s): 8.1%us, 10.8%sy, 0.0%ni, 69.0%id, 9.5%wa, 0.0%hi, 2.5%si, 0.0%st
Mem: 65384424k total, 58104868k used, 7279556k free, 240744k buffers
Swap: 29999100k total, 344944k used, 29654156k free, 24271188k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25931 root 20 0 17.3g 12g 9m R 98 20.2 7957:37 ceph-mon
24514 root 20 0 17.3g 12g 9m S 2 20.2 3:06.75 ceph-mon
25932 root 20 0 17.3g 12g 9m S 2 20.2 1:07.82 ceph-mon
24407 root 20 0 17.3g 12g 9m S 0 20.2 0:00.67 ceph-mon
24508 root 20 0 17.3g 12g 9m S 0 20.2 15:50.24 ceph-mon
24513 root 20 0 17.3g 12g 9m S 0 20.2 0:07.88 ceph-mon
24534 root 20 0 17.3g 12g 9m S 0 20.2 196:33.85 ceph-mon
24535 root 20 0 17.3g 12g 9m S 0 20.2 0:00.01 ceph-mon
25929 root 20 0 17.3g 12g 9m S 0 20.2 3:06.09 ceph-mon
25930 root 20 0 17.3g 12g 9m S 0 20.2 8:12.58 ceph-mon
25933 root 20 0 17.3g 12g 9m S 0 20.2 4:42.22 ceph-mon
25934 root 20 0 17.3g 12g 9m S 0 20.2 40:53.27 ceph-mon
25935 root 20 0 17.3g 12g 9m S 0 20.2 0:04.84 ceph-mon
25936 root 20 0 17.3g 12g 9m S 0 20.2 0:00.01 ceph-mon
25980 root 20 0 17.3g 12g 9m S 0 20.2 0:06.65 ceph-mon
25986 root 20 0 17.3g 12g 9m S 0 20.2 48:26.77 ceph-mon
55738 root 20 0 17.3g 12g 9m S 0 20.2 0:09.06 ceph-mon

Thread 20 (Thread 0x7f3e77e80700 (LWP 25931)):
#0  0x00007f3e7e83a653 in pread64 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00000000009286cf in ?? ()
#2  0x000000000092c187 in leveldb::ReadBlock(leveldb::RandomAccessFile*, leveldb::ReadOptions const&, leveldb::BlockHandle const&, leveldb::Block**) ()
#3  0x0000000000922f41 in leveldb::Table::BlockReader(void*, leveldb::ReadOptions const&, leveldb::Slice const&) ()
#4  0x0000000000924840 in ?? ()
#5  0x0000000000924b39 in ?? ()
#6  0x0000000000924a7a in ?? ()
#7  0x00000000009227d0 in ?? ()
#8  0x00000000009140b6 in ?? ()
#9  0x00000000009143dd in ?? ()
#10 0x000000000088d399 in LevelDBStore::LevelDBWholeSpaceIteratorImpl::lower_bound(std::string const&, std::string const&) ()
#11 0x000000000088bf00 in LevelDBStore::get(std::string const&, std::set<std::string, std::less<std::string>, std::allocator<std::string> > const&, std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > >*) ()
#12 0x000000000056a7a2 in MonitorDBStore::get(std::string const&, std::string const&) ()
#13 0x00000000005dcf61 in PaxosService::refresh(bool*) ()
#14 0x000000000058a76b in Monitor::refresh_from_paxos(bool*) ()
#15 0x00000000005c55ac in Paxos::do_refresh() ()
#16 0x00000000005cc093 in Paxos::handle_commit(MMonPaxos*) ()
#17 0x00000000005d4d8b in Paxos::dispatch(PaxosServiceMessage*) ()
#18 0x00000000005ac204 in Monitor::dispatch(MonSession*, Message*, bool) ()
#19 0x00000000005a9b09 in Monitor::_ms_dispatch(Message*) ()
#20 0x00000000005c48a2 in Monitor::ms_dispatch(Message*) ()
#21 0x00000000008b2e67 in Messenger::ms_deliver_dispatch(Message*) ()
#22 0x00000000008b000a in DispatchQueue::entry() ()
#23 0x00000000007a069d in DispatchQueue::DispatchThread::entry() ()
#24 0x00007f3e7e832e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#25 0x00007f3e7cff638d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#26 0x0000000000000000 in ?? ()

Thanks
Best regards
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com