> 2 active+clean+scrubbing+deep

* Set noscrub and nodeep-scrub:

  # ceph osd set noscrub
  # ceph osd set nodeep-scrub

* Wait for the scrubbing+deep to complete
* Do `ceph -s`

If you are still seeing high CPU usage, please identify which process(es) are eating the CPU:

* ps aux | sort -rk 3,4 | head -n 20

And let us know.
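It can also be worth confirming how large the mon's leveldb store actually is on disk. A quick check, assuming the default mon data path (adjust the path if your ceph.conf sets `mon data` somewhere else):

  # du -sh /var/lib/ceph/mon/ceph-`hostname -s`/store.db

Comparing that number before and after `ceph tell mon.<id> compact` shows whether compaction is actually reclaiming space.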
On Mon, Feb 13, 2017 at 9:39 PM, Chenyehua <chen.yehua@xxxxxxx> wrote:
> Thanks for the response, Shinobu
> The warning disappeared thanks to your suggested solution; however, the nearly 100% CPU usage still exists and concerns me a lot.
> So, do you know why the CPU usage is so high?
> Are there any solutions or suggestions for this problem?
>
> Cheers
>
> -----Original Message-----
> From: Shinobu Kinjo [mailto:skinjo@xxxxxxxxxx]
> Sent: February 13, 2017 10:54
> To: chenyehua 11692 (RD)
> Cc: kchai@xxxxxxxxxx; ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Re: mon is stuck in leveldb and costs nearly 100% cpu
>
> O.k., that's a reasonable answer. Would you run the following on all hosts which the MONs are running on:
>
> #* ceph --admin-daemon /var/run/ceph/ceph-mon.`hostname -s`.asok config show | grep leveldb_log
>
> Anyway, you can compact the leveldb store at runtime:
>
> #* ceph tell mon.`hostname -s` compact
>
> And you should set the following in ceph.conf to prevent the same issue next time:
>
> #* [mon]
> #* mon compact on start = true
>
>
> On Mon, Feb 13, 2017 at 11:37 AM, Chenyehua <chen.yehua@xxxxxxx> wrote:
>> Sorry, I made a mistake, the ceph version is actually 0.94.5
>>
>> -----Original Message-----
>> From: chenyehua 11692 (RD)
>> Sent: February 13, 2017 9:40
>> To: 'Shinobu Kinjo'
>> Cc: kchai@xxxxxxxxxx; ceph-users@xxxxxxxxxxxxxx
>> Subject: Re: mon is stuck in leveldb and costs nearly 100% cpu
>>
>> My ceph version is 10.2.5
>>
>> -----Original Message-----
>> From: Shinobu Kinjo [mailto:skinjo@xxxxxxxxxx]
>> Sent: February 12, 2017 13:12
>> To: chenyehua 11692 (RD)
>> Cc: kchai@xxxxxxxxxx; ceph-users@xxxxxxxxxxxxxx
>> Subject: Re: mon is stuck in leveldb and costs nearly 100% cpu
>>
>> Which Ceph version are you using?
>>
>> On Sat, Feb 11, 2017 at 5:02 PM, Chenyehua <chen.yehua@xxxxxxx> wrote:
>>> Dear Mr Kefu Chai
>>>
>>> Sorry to disturb you.
>>>
>>> I ran into a problem recently. In my ceph cluster, the health status has shown the warning “store is getting too big!” for several days, and ceph-mon is using nearly 100% CPU.
>>>
>>> Have you ever encountered this situation?
>>>
>>> Some detailed information is attached below:
>>>
>>> root@cvknode17:~# ceph -s
>>>     cluster 04afba60-3a77-496c-b616-2ecb5e47e141
>>>      health HEALTH_WARN
>>>             mon.cvknode17 store is getting too big! 34104 MB >= 15360 MB
>>>      monmap e1: 3 mons at {cvknode15=172.16.51.15:6789/0,cvknode16=172.16.51.16:6789/0,cvknode17=172.16.51.17:6789/0}
>>>             election epoch 862, quorum 0,1,2 cvknode15,cvknode16,cvknode17
>>>      osdmap e196279: 347 osds: 347 up, 347 in
>>>       pgmap v5891025: 33272 pgs, 16 pools, 26944 GB data, 6822 kobjects
>>>             65966 GB used, 579 TB / 644 TB avail
>>>                33270 active+clean
>>>                    2 active+clean+scrubbing+deep
>>>   client io 840 kB/s rd, 739 kB/s wr, 35 op/s rd, 184 op/s wr
>>>
>>> root@cvknode17:~# top
>>> top - 15:19:28 up 23 days, 23:58, 6 users, load average: 1.08, 1.40, 1.77
>>> Tasks: 346 total, 2 running, 342 sleeping, 0 stopped, 2 zombie
>>> Cpu(s): 8.1%us, 10.8%sy, 0.0%ni, 69.0%id, 9.5%wa, 0.0%hi, 2.5%si, 0.0%st
>>> Mem: 65384424k total, 58102880k used, 7281544k free, 240720k buffers
>>> Swap: 29999100k total, 344944k used, 29654156k free, 24274272k cached
>>>
>>>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
>>> 24407 root  20   0 17.3g  12g  10m S   98 20.2  8420:11 ceph-mon
>>>
>>> root@cvknode17:~# top -Hp 24407
>>> top - 15:19:49 up 23 days, 23:59, 6 users, load average: 1.12, 1.39, 1.76
>>> Tasks: 17 total, 1 running, 16 sleeping, 0 stopped, 0 zombie
>>> Cpu(s): 8.1%us, 10.8%sy, 0.0%ni, 69.0%id, 9.5%wa, 0.0%hi, 2.5%si, 0.0%st
>>> Mem: 65384424k total, 58104868k used, 7279556k free, 240744k buffers
>>> Swap: 29999100k total, 344944k used, 29654156k free, 24271188k cached
>>>
>>>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
>>> 25931 root  20   0 17.3g  12g   9m R   98 20.2  7957:37 ceph-mon
>>> 24514 root  20   0 17.3g  12g   9m S    2 20.2    3:06.75 ceph-mon
>>> 25932 root  20   0 17.3g  12g   9m S    2 20.2    1:07.82 ceph-mon
>>> 24407 root  20   0 17.3g  12g   9m S    0 20.2    0:00.67 ceph-mon
>>> 24508 root  20   0 17.3g  12g   9m S    0 20.2   15:50.24 ceph-mon
>>> 24513 root  20   0 17.3g  12g   9m S    0 20.2    0:07.88 ceph-mon
>>> 24534 root  20   0 17.3g  12g   9m S    0 20.2  196:33.85 ceph-mon
>>> 24535 root  20   0 17.3g  12g   9m S    0 20.2    0:00.01 ceph-mon
>>> 25929 root  20   0 17.3g  12g   9m S    0 20.2    3:06.09 ceph-mon
>>> 25930 root  20   0 17.3g  12g   9m S    0 20.2    8:12.58 ceph-mon
>>> 25933 root  20   0 17.3g  12g   9m S    0 20.2    4:42.22 ceph-mon
>>> 25934 root  20   0 17.3g  12g   9m S    0 20.2   40:53.27 ceph-mon
>>> 25935 root  20   0 17.3g  12g   9m S    0 20.2    0:04.84 ceph-mon
>>> 25936 root  20   0 17.3g  12g   9m S    0 20.2    0:00.01 ceph-mon
>>> 25980 root  20   0 17.3g  12g   9m S    0 20.2    0:06.65 ceph-mon
>>> 25986 root  20   0 17.3g  12g   9m S    0 20.2   48:26.77 ceph-mon
>>> 55738 root  20   0 17.3g  12g   9m S    0 20.2    0:09.06 ceph-mon
>>>
>>> Thread 20 (Thread 0x7f3e77e80700 (LWP 25931)):
>>> #0  0x00007f3e7e83a653 in pread64 () from /lib/x86_64-linux-gnu/libpthread.so.0
>>> #1  0x00000000009286cf in ?? ()
>>> #2  0x000000000092c187 in leveldb::ReadBlock(leveldb::RandomAccessFile*, leveldb::ReadOptions const&, leveldb::BlockHandle const&, leveldb::Block**) ()
>>> #3  0x0000000000922f41 in leveldb::Table::BlockReader(void*, leveldb::ReadOptions const&, leveldb::Slice const&) ()
>>> #4  0x0000000000924840 in ?? ()
>>> #5  0x0000000000924b39 in ?? ()
>>> #6  0x0000000000924a7a in ?? ()
>>> #7  0x00000000009227d0 in ?? ()
>>> #8  0x00000000009140b6 in ?? ()
>>> #9  0x00000000009143dd in ?? ()
>>> #10 0x000000000088d399 in LevelDBStore::LevelDBWholeSpaceIteratorImpl::lower_bound(std::string const&, std::string const&) ()
>>> #11 0x000000000088bf00 in LevelDBStore::get(std::string const&, std::set<std::string, std::less<std::string>, std::allocator<std::string> > const&, std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > >*) ()
>>> #12 0x000000000056a7a2 in MonitorDBStore::get(std::string const&, std::string const&) ()
>>> #13 0x00000000005dcf61 in PaxosService::refresh(bool*) ()
>>> #14 0x000000000058a76b in Monitor::refresh_from_paxos(bool*) ()
>>> #15 0x00000000005c55ac in Paxos::do_refresh() ()
>>> #16 0x00000000005cc093 in Paxos::handle_commit(MMonPaxos*) ()
>>> #17 0x00000000005d4d8b in Paxos::dispatch(PaxosServiceMessage*) ()
>>> #18 0x00000000005ac204 in Monitor::dispatch(MonSession*, Message*, bool) ()
>>> #19 0x00000000005a9b09 in Monitor::_ms_dispatch(Message*) ()
>>> #20 0x00000000005c48a2 in Monitor::ms_dispatch(Message*) ()
>>> #21 0x00000000008b2e67 in Messenger::ms_deliver_dispatch(Message*) ()
>>> #22 0x00000000008b000a in DispatchQueue::entry() ()
>>> #23 0x00000000007a069d in DispatchQueue::DispatchThread::entry() ()
>>> #24 0x00007f3e7e832e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
>>> #25 0x00007f3e7cff638d in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>> #26 0x0000000000000000 in ?? ()
>>>
>>> Thanks
>>> Best regards
>>>
>>> ----------------------------------------------------------------------------------------------------------------------------------------
>>> This e-mail and its attachments contain confidential information from
>>> H3C, which is intended only for the person or entity whose address is
>>> listed above. Any use of the information contained herein in any way
>>> (including, but not limited to, total or partial disclosure,
>>> reproduction, or dissemination) by persons other than the intended
>>> recipient(s) is prohibited. If you receive this e-mail in error,
>>> please notify the sender by phone or email immediately and delete it!
>>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com