Hi guys,
I have a 42-node cluster, and I created the pool with expected_num_objects to pre-split the filestore directories. Today I rebuilt an OSD because of a disk error, and it caused a lot of slow requests. The filestore log looks like this:

2018-11-26 16:49:41.003336 7f2dad075700 10 filestore(/home/ceph/var/lib/osd/ceph-4) create_collection /home/ceph/var/lib/osd/ceph-4/current/388.433_head = 0
2018-11-26 16:49:41.003479 7f2dad075700 10 filestore(/home/ceph/var/lib/osd/ceph-4) create_collection /home/ceph/var/lib/osd/ceph-4/current/388.433_TEMP = 0
2018-11-26 16:49:41.003570 7f2dad075700 10 filestore(/home/ceph/var/lib/osd/ceph-4) _set_replay_guard 33.0.0
2018-11-26 16:49:41.003591 7f2dad876700 5 filestore(/home/ceph/var/lib/osd/ceph-4) _journaled_ahead 0x55e054382300 seq 81 osr(388.2bd 0x55e053ed9280) [Transaction(0x55e06d304680)]
2018-11-26 16:49:41.003603 7f2dad876700 5 filestore(/home/ceph/var/lib/osd/ceph-4) queue_op 0x55e054382300 seq 81 osr(388.2bd 0x55e053ed9280) 1079089 bytes (queue has 50 ops and 15513428 bytes)
2018-11-26 16:49:41.003608 7f2dad876700 10 filestore(/home/ceph/var/lib/osd/ceph-4) queueing ondisk 0x55e06cc83f80
2018-11-26 16:49:41.024714 7f2d9d055700 5 filestore(/home/ceph/var/lib/osd/ceph-4) queue_transactions existing 0x55e053a5d1e0 osr(388.f2a 0x55e053ed92e0)
2018-11-26 16:49:41.166512 7f2dac874700 10 filestore oid: #388:c9400000::::head# not skipping op, *spos 32.0.1
2018-11-26 16:49:41.166522 7f2dac874700 10 filestore > header.spos 0.0.0
2018-11-26 16:49:41.170670 7f2dac874700 10 filestore oid: #388:c9400000::::head# not skipping op, *spos 32.0.2
2018-11-26 16:49:41.170680 7f2dac874700 10 filestore > header.spos 0.0.0
2018-11-26 16:49:41.183259 7f2dac874700 10 filestore(/home/ceph/var/lib/osd/ceph-4) _do_op 0x55e05ddb3480 seq 32 r = 0, finisher 0x55e051d122e0 0
2018-11-26 16:49:41.187211 7f2dac874700 10 filestore(/home/ceph/var/lib/osd/ceph-4) _finish_op 0x55e05ddb3480 seq 32 osr(388.293 0x55e053ed84b0)/0x55e053ed84b0 lat 47.804533
2018-11-26 16:49:41.187232 7f2dac874700 5 filestore(/home/ceph/var/lib/osd/ceph-4) _do_op 0x55e052113e60 seq 34 osr(388.2d94 0x55e053ed91c0)/0x55e053ed91c0 start
2018-11-26 16:49:41.187236 7f2dac874700 10 filestore(/home/ceph/var/lib/osd/ceph-4) _do_transaction on 0x55e05e022140
2018-11-26 16:49:41.187239 7f2da4864700 5 filestore(/home/ceph/var/lib/osd/ceph-4) queue_transactions (writeahead) 82 [Transaction(0x55e0559e6d80)]

It looks like creating a PG directory such as /home/ceph/var/lib/osd/ceph-4/current/388.433 is very slow.

However, right after the service starts, while the OSD is not yet up, it works well: PG dirs are being created and there are no slow requests. Once the OSD state is up, slow requests appear while PG dirs are still being created.

When I disable the setting filestore merge threshold = -10 in ceph.conf, the rebuild works well and PG dirs are created very quickly. Then I see directory splits in the log:

2018-11-26 19:16:56.406276 7f768b189700 1 _created [8,F,8] has 593 objects, starting split.
2018-11-26 19:16:56.977392 7f768b189700 1 _created [8,F,8] split completed.
2018-11-26 19:16:57.032567 7f768b189700 1 _created [8,F,8,6] has 594 objects, starting split.
2018-11-26 19:16:57.814694 7f768b189700 1 _created [8,F,8,6] split completed.
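
For context on when a split like the ones logged above starts, here is a minimal Python sketch of the split limit as described in the Ceph filestore configuration reference: (filestore_split_multiple * abs(filestore_merge_threshold) + random offset) * 16 files per subdirectory. The function names are hypothetical, and the values used (split multiple 2, a random offset of 17) are assumptions for illustration; the post only states filestore merge threshold = -10.

```python
def split_threshold(split_multiple, merge_threshold, rand_offset=0):
    """Max files in a subdirectory before filestore splits it.

    Follows the documented formula; rand_offset stands in for the
    (rand() % filestore_split_rand_factor) term and is 0 when unused.
    """
    return (split_multiple * abs(merge_threshold) + rand_offset) * 16


def merging_disabled(merge_threshold):
    """A negative merge threshold is documented as disabling subdir merging."""
    return merge_threshold < 0


# Assumed values: split multiple 2 is not stated in the post.
print(split_threshold(2, -10))        # no random offset
print(split_threshold(2, -10, 17))    # hypothetical random offset of 17
print(merging_disabled(-10))
```

Under these assumed values the limit with an offset of 17 is 592 files, which would be consistent with a split starting at 593 objects, but the actual settings on this cluster may differ.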

So, how can I configure things so that all PG dirs are created before the OSD state becomes up? Or is there another solution?

Thanks.