Hi Igor,

Sorry for my late reply; I had some important work to finish. The environment information is as follows:

1) Sorry, do you mean the disk type?
2) The Ceph version is 12.2.5.
3) The file 117.perf contains the perf dump: 117.perf
4) "ceph-osd.117.osd.bluefs2" is the log I collected (search for "5.1af deep-scrub starts" in this file; it begins at line 21789): ceph-osd.117.log.bluefs2

Best wishes,
songbo

Igor Fedotov <ifedotov@xxxxxxx> wrote on Thursday, February 28, 2019 at 5:49 PM:
>
> Hi Wang,
>
> there is a somewhat similar issue in the following ticket:
>
> http://tracker.ceph.com/issues/36482
>
>
> We're also currently troubleshooting something similar. Will update
> in the above ticket.
>
>
> Could you please provide a bit more information on your cluster:
>
> 1) what drives are at this OSD?
>
> 2) what's the Ceph version?
>
> 3) Could you please dump performance counters for this specific OSD?
>
> 4) Could you please raise 'debug bluefs' to 20, force the issue to
> happen, and collect the logs?
>
>
> Thanks,
>
> Igor
>
> On 2/28/2019 11:20 AM, Songbo Wang wrote:
> > Hi guys,
> >
> > In my production environment, when a deep-scrub runs on certain empty PGs,
> > the OSD goes down. I found that the ops on RocksDB take too long,
> > which makes the thread unhealthy, so no ping requests are sent to the
> > peer OSDs. The monitor then receives reports from the peer OSDs and
> > marks this OSD down.
> >
> > The following is the log from the primary OSD.
> >
> > 2019-02-27 11:00:00.604029 7f9592f84700 20 osd.92 pg_epoch: 13415
> > pg[5.1bf( empty local-lis/les=13413/13414 n=0 ec=175/175 lis/c
> > 13413/13413 les/c/f 13414/13414/0 13413/13413/13413) [92,142,111] r=0
> > lpr=13413 crt=0'0 mlcod 0'0 active+clean+scrubbing+deep] scrub state
> > INACTIVE [MIN,MIN)
> > 2019-02-27 11:00:00.604168 7f9592f84700 20 osd.92 pg_epoch: 13415
> > pg[5.1bf( empty local-lis/les=13413/13414 n=0 ec=175/175 lis/c
> > 13413/13413 les/c/f 13414/13414/0 13413/13413/13413) [92,142,111] r=0
> > lpr=13413 crt=0'0 mlcod 0'0 active+clean+scrubbing+deep] scrub state
> > NEW_CHUNK [5:fd800000::::head,MIN)
> > 2019-02-27 11:00:00.604189 7f9592f84700 15
> > bluestore(/var/lib/ceph/osd/ceph-92) collection_list 5.1bf_head start
> > #5:fd800000::::head#0 end GHMAX max 5
> > 2019-02-27 11:00:00.604195 7f9592f84700 20
> > bluestore(/var/lib/ceph/osd/ceph-92) _collection_list range
> > 0x7f7ffffffffffffff9fd800000 to 0x7f7ffffffffffffff9fe000000 and
> > 0x7f8000000000000005fd800000 to 0x7f8000000000000005fe000000 start
> > #5:fd800000::::head#0
> > 2019-02-27 11:00:00.611606 7f9592f84700 20
> > bluestore(/var/lib/ceph/osd/ceph-92) _collection_list pend
> > 0x7f8000000000000005fe000000
> > 2019-02-27 11:00:00.611617 7f9592f84700 30
> > bluestore(/var/lib/ceph/osd/ceph-92) _collection_list key
> > 0x7f8000000000000005fd80000021213dfffffffffffffffeffffffffffffffff'o'
> > 2019-02-27 11:00:00.611621 7f9592f84700 20
> > bluestore(/var/lib/ceph/osd/ceph-92) _collection_list oid
> > #5:fd800000::::head# end GHMAX
> > 2019-02-27 11:00:26.142032 7f9592f84700 20
> > bluestore(/var/lib/ceph/osd/ceph-92) _collection_list key
> > 0x7f800000000000001d04c0000021213dfffffffffffffffeffffffffffffffff'o'
> >> = GHMAX
> > 2019-02-27 11:00:26.142077 7f9592f84700 10
> > bluestore(/var/lib/ceph/osd/ceph-92) collection_list 5.1bf_head start
> > GHMAX end GHMAX max 5 = 0, ls.size() = 1, next = GHMAX
> > 2019-02-27 11:00:26.142245 7f9592f84700 20 osd.92 pg_epoch: 13415
> > pg[5.1bf( empty local-lis/les=13413/13414 n=0 ec=175/175 lis/c
> > 13413/13413 les/c/f 13414/13414/0 13413/13413/13413) [92,142,111] r=0
> > lpr=13413 crt=0'0 mlcod 0'0 active+clean+scrubbing+deep] scrub state
> > WAIT_PUSHES [5:fd800000::::head,MAX)
> > 2019-02-27 11:00:26.142258 7f9592f84700 20 osd.92 pg_epoch: 13415
> > pg[5.1bf( empty local-lis/les=13413/13414 n=0 ec=175/175 lis/c
> > 13413/13413 les/c/f 13414/13414/0 13413/13413/13413) [92,142,111] r=0
> > lpr=13413 crt=0'0 mlcod 0'0 active+clean+scrubbing+deep] scrub state
> > WAIT_LAST_UPDATE [5:fd800000::::head,MAX)
> > 2019-02-27 11:00:26.142267 7f9592f84700 20 osd.92 pg_epoch: 13415
> > pg[5.1bf( empty local-lis/les=13413/13414 n=0 ec=175/175 lis/c
> > 13413/13413 les/c/f 13414/13414/0 13413/13413/13413) [92,142,111] r=0
> > lpr=13413 crt=0'0 mlcod 0'0 active+clean+scrubbing+deep] scrub state
> > BUILD_MAP [5:fd800000::::head,MAX)
> > 2019-02-27 11:00:26.142286 7f9592f84700 15
> > bluestore(/var/lib/ceph/osd/ceph-92) collection_list 5.1bf_head start
> > #5:fd800000::::head# end #MAX# max 2147483647
> > 2019-02-27 11:00:26.142292 7f9592f84700 20
> > bluestore(/var/lib/ceph/osd/ceph-92) _collection_list range
> > 0x7f7ffffffffffffff9fd800000 to 0x7f7ffffffffffffff9fe000000 and
> > 0x7f8000000000000005fd800000 to 0x7f8000000000000005fe000000 start
> > #5:fd800000::::head#
> > 2019-02-27 11:00:26.142805 7f9592f84700 20
> > bluestore(/var/lib/ceph/osd/ceph-92) _collection_list pend
> > 0x7f8000000000000005fe000000
> > 2019-02-27 11:00:26.142815 7f9592f84700 30
> > bluestore(/var/lib/ceph/osd/ceph-92) _collection_list key
> > 0x7f8000000000000005fd80000021213dfffffffffffffffeffffffffffffffff'o'
> > 2019-02-27 11:00:26.142820 7f9592f84700 20
> > bluestore(/var/lib/ceph/osd/ceph-92) _collection_list oid
> > #5:fd800000::::head# end #MAX#
> > 2019-02-27 11:00:51.796411 7f9592f84700 20
> > bluestore(/var/lib/ceph/osd/ceph-92) _collection_list key
> > 0x7f800000000000001d04c0000021213dfffffffffffffffeffffffffffffffff'o'
> >> = #MAX#
> > 2019-02-27 11:00:51.796447 7f9592f84700 10
> > bluestore(/var/lib/ceph/osd/ceph-92) collection_list 5.1bf_head start
> > #5:fd800000::::head# end #MAX# max 2147483647 = 0, ls.size() = 1, next
> > = GHMIN
> > 2019-02-27 11:00:51.796564 7f9592f84700 20 osd.92 pg_epoch: 13415
> > pg[5.1bf( empty local-lis/les=13413/13414 n=0 ec=175/175 lis/c
> > 13413/13413 les/c/f 13414/13414/0 13413/13413/13413) [92,142,111] r=0
> > lpr=13413 crt=0'0 mlcod 0'0 active+clean+scrubbing+deep] scrub state
> > WAIT_REPLICAS [5:fd800000::::head,MAX)
> >
> > I exported the BlueFS using ceph-bluestore-tool; it is about 5 GB in
> > size and is located on a SATA SSD.
> > I know little about RocksDB and cannot find the real cause, so what
> > kind of RocksDB problem could this be?
> > Any suggestions are appreciated, thanks!
> >
> > Best regards.
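In case it helps anyone reproduce this, the data mentioned above can be collected with commands roughly along these lines. This is only a sketch reusing the OSD ids and PG from this thread (osd.117, osd.92, pg 5.1af); the output paths are just examples, and it is not an exact record of what I ran:

  # dump the performance counters of the affected OSD via its admin socket
  ceph daemon osd.117 perf dump > 117.perf

  # raise bluefs logging to 20 at runtime, then trigger the deep-scrub on the empty PG
  ceph tell osd.117 injectargs '--debug_bluefs 20/20'
  ceph pg deep-scrub 5.1af

  # with the OSD stopped, export the BlueFS files (the RocksDB data) for offline inspection
  systemctl stop ceph-osd@92
  ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/ceph-92 --out-dir /tmp/ceph-92-bluefs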