Is there a way I can check whether this process is causing the performance issues?

On Tue, 6 Oct 2020 at 13:05, Igor Fedotov <ifedotov@xxxxxxx> wrote:
>
> On 10/6/2020 1:04 PM, Kristof Coucke wrote:
>
> Another strange thing is going on:
>
> No client software is using the system any longer, so we would expect
> all IO to be related to the recovery (fixing of the degraded PG).
> However, the disks that are showing high IO are not members of the PGs
> that are being fixed.
>
> So something is using those disks heavily, but I can't immediately find
> the process. I've read that there can be old client processes that keep
> connecting to an OSD to retrieve data for a specific PG while that PG is
> no longer available on that disk.
>
>
> I bet it's rather PG removal happening in the background...
>
>
> On Tue, 6 Oct 2020 at 11:41, Kristof Coucke <kristof.coucke@xxxxxxxxx> wrote:
>
>> Yes, some disks are spiking near 100%... The delay I see with iostat
>> (r_await) seems to line up with the delays between the queued_for_pg
>> and reached_pg events.
>> The NVMe disks are not spiking, just the spinner disks.
>>
>> I know RocksDB is only partially on the NVMe. The read-ahead (OS level)
>> is also 128 kB for the spinner disks. As we are dealing with smaller
>> files, this might also hurt performance.
>>
>> I'm still investigating, but I'm wondering whether the system is also
>> reading from disk to find the KV pairs.
>>
>>
>> On Tue, 6 Oct 2020 at 11:23, Igor Fedotov <ifedotov@xxxxxxx> wrote:
>>
>>> Hi Kristof,
>>>
>>> Are you seeing high (around 100%) utilization of the OSDs' disks (main
>>> or DB ones) along with the slow ops?
>>>
>>> Thanks,
>>>
>>> Igor
>>>
>>> On 10/6/2020 11:09 AM, Kristof Coucke wrote:
>>> > Hi all,
>>> >
>>> > We have a Ceph cluster which has been expanded from 10 to 16 nodes.
>>> > Each node has between 14 and 16 OSDs, of which 2 are NVMe disks.
>>> > Most disks (except the NVMes) are 16 TB.
>>> >
>>> > The expansion to 16 nodes went OK, but we had configured the system
>>> > to prevent automatic rebalancing towards the new disks (their weight
>>> > was set to 0) so we could control the expansion.
>>> >
>>> > We started by adding 6 disks last week (1 disk on each new node),
>>> > which didn't cause many issues.
>>> > When the Ceph status indicated that the PG degradation was almost
>>> > resolved, we added 2 more disks on each node.
>>> >
>>> > All seemed to go fine until yesterday morning... IO towards the
>>> > system started slowing down.
>>> >
>>> > Diving into the nodes, we could see that the OSD daemons are
>>> > consuming the CPU, resulting in load averages near 10 (!).
>>> >
>>> > Neither the RGWs nor the monitors nor the other involved servers are
>>> > having CPU issues (except for the management server, which is
>>> > fighting with Prometheus), so the latency seems to be related to the
>>> > OSD hosts.
>>> > All hosts are interconnected with 25 Gbit links; no bottlenecks are
>>> > being hit on the network either.
>>> >
>>> > An important piece of information: we are using erasure coding (6/3),
>>> > and we do have a lot of small files...
>>> > The current health detail indicates degraded data redundancy, with
>>> > 1192911/103387889228 objects degraded (1 PG degraded, 1 PG
>>> > undersized).
>>> >
>>> > Diving into the historic ops of an OSD, we can see that the main
>>> > latency sits between the "queued_for_pg" and "reached_pg" events
>>> > (averaging +/- 3 seconds).
>>> >
>>> > As the system load is quite high, I assume the systems are busy
>>> > recalculating the code chunks for the new disks we've added (though
>>> > I'm not sure), but I was wondering how I can better tune the system
>>> > or pinpoint the exact bottleneck.
>>> > Latency towards the disks doesn't seem to be an issue at first
>>> > sight...
>>> >
>>> > We are running Ceph 14.2.11.
>>> >
>>> > Who can give me some thoughts on how I can better pinpoint the
>>> > bottleneck?
>>> >
>>> > Thanks
>>> >
>>> > Kristof
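
Coming back to the question at the top of the thread: a minimal way to check
whether background PG removal is what is keeping a disk busy is to watch that
OSD's stray/removing PG counters next to the device utilization. This is only
a sketch: osd.12 is a placeholder for one of the busy OSDs, the ceph daemon
calls have to run on the host where that OSD lives, and the counter names are
the ones I'd expect from a Nautilus-era OSD perf dump.

  # How many PGs does this OSD still hold that no longer map to it,
  # and how many is it deleting right now?
  ceph daemon osd.12 perf dump | grep -E 'numpg_(stray|removing)'

  # Correlate with per-device utilization/latency on the same host:
  # ~100% util and high r_await on the spinner while the counters above
  # are non-zero points at background PG deletion rather than recovery.
  iostat -x 1

If that turns out to be the cause, the deletions should drain on their own
over time; throttling them (the osd_delete_sleep* options come to mind,
assuming they are available in 14.2.11) is worth checking against the
Nautilus docs before changing anything.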
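
To double-check the observation that the busy disks are not part of the PGs
being fixed, the PG-to-OSD mapping can be listed per OSD (osd.12 again being
a placeholder; the optional state filter is how I remember the command, so
verify it against your version):

  # All PGs currently mapped to this OSD
  ceph pg ls-by-osd osd.12

  # Only the degraded ones, if any
  ceph pg ls-by-osd osd.12 degraded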
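
For reference, the per-op event timestamps discussed above (queued_for_pg vs.
reached_pg) come from the OSD admin socket; run this on the OSD's host, with
osd.12 as a placeholder:

  # Recently completed slow ops, each with a timestamped event list;
  # a large gap between queued_for_pg and reached_pg typically means the
  # op sat waiting for the PG / OSD worker threads, not for the disk.
  ceph daemon osd.12 dump_historic_ops

  # Ops currently in flight on that OSD
  ceph daemon osd.12 dump_ops_in_flight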
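
And since the 128 kB read-ahead came up: it can be inspected (and experimented
with) per device through sysfs. sdb is just a placeholder for one of the
spinners, and whether a different value actually helps this small-file
workload is something to measure rather than assume:

  # Current read-ahead in kB for one spinner
  cat /sys/block/sdb/queue/read_ahead_kb

  # Temporarily try another value (not persistent across reboots)
  echo 256 > /sys/block/sdb/queue/read_ahead_kb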