Hello again,

Getting back to this:

On Sun, 4 Aug 2019 10:47:27 +0900 Christian Balzer wrote:

> Hello,
>
> preparing the first production bluestore, nautilus (latest) based cluster
> I've run into the same things other people and myself ran into before.
>
> Firstly HW, 3 nodes with 12 SATA HDDs each, IT mode LSI 3008, wal/db on
> 40GB SSD partitions. (boy do I hate the inability of ceph-volume to deal
> with raw partitions).
> SSDs aren't a bottleneck in any scenario.
> Single E5-1650 v3 @ 3.50GHz, cpu isn't a bottleneck in any scenario, less
> than 15% of a core per OSD.
>
> Connection is via 40GB/s infiniband, IPoIB, no issues here as numbers later
> will show.
>
> Clients are KVMs on Epyc based compute nodes, maybe some more speed could
> be squeezed out here with different VM configs, but the cpu isn't an issue
> in the problem cases.
>
>
>
> 1. 4k random I/O can cause degraded PGs
> I've run into the same/similar issue as Nathan Fish here:
> https://www.spinics.net/lists/ceph-users/msg526
> During the first 2 tests with 4k random I/O I got shortly degraded PGs as
> well, with no indication in CPU or SSD utilization accounting for this.
> HDDs were of course busy at that time.
> Wasn't able to reproduce this so far, but it leaves me less than
> confident.
>
>

This happened again yesterday when rsyncing 260GB of files averaging 4MB
each into a VM backed by a Ceph image.
Given the nature of this rsync, nothing on the Ceph nodes was the least bit
busy: the HDDs were all below 15% utilization, CPU bored, etc.

Still we got:
---
2019-08-07 15:38:23.452580 osd.21 (osd.21) 651 : cluster [DBG] 1.125 starting backfill to osd.9 from (0'0,0'0] MAX to 1297'21584
2019-08-07 15:38:24.454942 mon.ceph-05 (mon.0) 182756 : cluster [WRN] Health check failed: Reduced data availability: 2 pgs peering (PG_AVAILABILITY)
2019-08-07 15:38:25.396756 mon.ceph-05 (mon.0) 182757 : cluster [DBG] osdmap e1302: 36 total, 36 up, 36 in
2019-08-07 15:38:23.452026 osd.12 (osd.12) 767 : cluster [DBG] 1.105 starting backfill to osd.25 from (0'0,0'0] MAX to 1297'6782
---

Unfortunately all I have in the OSD log is this:
---
2019-08-07 15:38:23.461 7f155e71b700 1 osd.9 pg_epoch: 1299 pg[1.125( empty local-lis/les=0/0 n=0 ec=189/189 lis/c 1286/1286 les/c/f 1287/1287/0 1298/1299/189) [21,9,28]/[21,28,3] r=-1 lpr=1299 pi=[1286,1299)/1 crt=0'0 unknown mbc={}] state<Start>: transitioning to Stray
2019-08-07 15:38:24.353 7f155e71b700 1 osd.9 pg_epoch: 1301 pg[1.125( v 1297'21584 (1246'18584,1297'21584] local-lis/les=1299/1300 n=5 ec=189/189 lis/c 1299/1299 les/c/f 1300/1300/0 1298/1301/189) [21,9,28] r=1 lpr=1301 pi=[1299,1301)/1 luod=0'0 crt=1297'21584 active mbc={}] start_peering_interval up [21,9,28] -> [21,9,28], acting [21,28,3] -> [21,9,28], acting_primary 21 -> 21, up_primary 21 -> 21, role -1 -> 1, features acting 4611087854031667199 upacting 4611087854031667199
2019-08-07 15:38:24.353 7f155e71b700 1 osd.9 pg_epoch: 1301 pg[1.125( v 1297'21584 (1246'18584,1297'21584] local-lis/les=1299/1300 n=5 ec=189/189 lis/c 1299/1299 les/c/f 1300/1300/0 1298/1301/189) [21,9,28] r=1 lpr=1301 pi=[1299,1301)/1 crt=1297'21584 unknown NOTIFY mbc={}] state<Start>: transitioning to Stray
---

How can I find out what happened here? Given that it might not happen again
anytime soon, cranking up debug levels now is a tad late.

Thanks,

Christian
-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Rakuten Mobile Inc.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx