On 22.09.20 22:09, Nico Schottelius wrote:
[...]
> All nodes are connected with 2x 10 Gbit/s bonded/LACP, so I'd expect at
> least a couple of hundred MB/s network bandwidth per OSD.
>
> On one server I just restarted the OSDs and now the read performance
> dropped down to 1-4 MB/s per OSD with being about 90% busy.
>
> Since nautilus we observed much longer starting times of OSDs and I
> wonder if the osd does some kind of fsck these days and delays the
> peering process because of that?
>
> The disks in question are 3.5"/10TB/6 Gbit/s SATA disks connected to an
> H800 controller - so generally speaking I do not see a reasonable
> bottleneck here.

Yes, I should! I saw in your mail:

1.) 1532 slow requests are blocked > 32 sec
    789 slow ops, oldest one blocked for 1949 sec, daemons
    [osd.12,osd.14,osd.2,osd.20,osd.23,osd.25,osd.3,osd.33,osd.35,osd.50]...
    have slow ops.

A request that is blocked for > 32 sec is odd! The same goes for 1949 sec.
In my experience, those will never finish. Sometimes they go away with OSD
restarts. Are those OSDs the ones you relocated?

2.) client:   91 MiB/s rd, 28 MiB/s wr, 1.76k op/s rd, 686 op/s wr
    recovery: 67 MiB/s, 17 objects/s

67 MB/s recovery is slower than what a single rotational disk can deliver.
Even 67 + 91 MB/s is not much, especially not for an 85-OSD cluster on 10G.
The ~2500 IOPS of client I/O translate into ~7500 "net" IOPS with pool
size 3 (rough numbers sketched below), so maybe that is the limit. But I
guess you already know that.

Before tuning, though, you should probably listen to Frank's advice about
the placements (see his other post). As soon as the unknown OSDs come back,
the speed will probably go up again due to the added parallelism.

rgds,
j.
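
PS: a minimal sketch of the size-3 arithmetic above, using the op/s numbers
from your status output. The even spread over all 85 OSDs is my own
assumption for illustration, not something measured on your cluster:

# Rough check of the replication arithmetic; the even per-OSD split is
# an assumption for illustration only.
client_iops = 1760 + 686                  # "1.76k op/s rd" + "686 op/s wr"
pool_size = 3
num_osds = 85

backend_iops = client_iops * pool_size    # ~7300 "net" IOPS hitting the disks
per_osd_iops = backend_iops / num_osds    # ~86 IOPS per OSD

print(f"backend ~{backend_iops} IOPS, ~{per_osd_iops:.0f} IOPS per OSD")

Something in the range of 85-90 random IOPS per OSD is already close to what
a 7.2k SATA spinner can sustain, which would fit the "maybe that is the
limit" guess.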