Env:
- OS: Ubuntu 20.04
- Ceph Version: Octopus 15.0.0.1
- OSD Disk: 2.9TB NVMe
- BlockStorage (Replication 3)

Symptom:
- Peering is very slow when an OSD's node comes up. Peering speed varies from PG to PG, and some PGs take as long as 10 seconds. However, nothing is logged during those 10 seconds.
- I checked the effect on client VMs. MySQL slow queries occur at the same time.

Below are the Ceph OSD logs for both the best and the worst case.

Best Peering Case (0.5 Seconds)
2024-04-11T15:32:44.693+0900 7f108b522700 1 osd.7 pg_epoch: 27368 pg[6.8] state<Start>: transitioning to Primary
2024-04-11T15:32:45.165+0900 7f108f52a700 1 osd.7 pg_epoch: 27371 pg[6.8] state<Started/Primary/Peering>: Peering, affected_by_map, going to Reset
2024-04-11T15:32:45.165+0900 7f108f52a700 1 osd.7 pg_epoch: 27371 pg[6.8] start_peering_interval up [7,6,11] -> [6,11], acting [7,6,11] -> [6,11], acting_primary 7 -> 6, up_primary 7 -> 6, role 0 -> -1, features acting
2024-04-11T15:32:45.165+0900 7f108f52a700 1 osd.7 pg_epoch: 27377 pg[6.8] state<Start>: transitioning to Primary
2024-04-11T15:32:45.165+0900 7f108f52a700 1 osd.7 pg_epoch: 27377 pg[6.8] start_peering_interval up [6,11] -> [7,6,11], acting [6,11] -> [7,6,11], acting_primary 6 -> 7, up_primary 6 -> 7, role -1 -> 0, features acting

Worst Peering Case (11.6 Seconds)
2024-04-11T15:32:45.169+0900 7f108b522700 1 osd.7 pg_epoch: 27377 pg[30.20] state<Start>: transitioning to Stray
2024-04-11T15:32:45.169+0900 7f108b522700 1 osd.7 pg_epoch: 27377 pg[30.20] start_peering_interval up [0,1] -> [0,7,1], acting [0,1] -> [0,7,1], acting_primary 0 -> 0, up_primary 0 -> 0, role -1 -> 1, features acting
2024-04-11T15:32:46.173+0900 7f108b522700 1 osd.7 pg_epoch: 27378 pg[30.20] state<Start>: transitioning to Stray
2024-04-11T15:32:46.173+0900 7f108b522700 1 osd.7 pg_epoch: 27378 pg[30.20] start_peering_interval up [0,7,1] -> [0,7,1], acting [0,7,1] -> [0,1], acting_primary 0 -> 0, up_primary 0 -> 0, role 1 -> -1, features acting
2024-04-11T15:32:57.794+0900 7f108b522700 1 osd.7 pg_epoch: 27390 pg[30.20] state<Start>: transitioning to Stray
2024-04-11T15:32:57.794+0900 7f108b522700 1 osd.7 pg_epoch: 27390 pg[30.20] start_peering_interval up [0,7,1] -> [0,7,1], acting [0,1] -> [0,7,1], acting_primary 0 -> 0, up_primary 0 -> 0, role -1 -> 1, features acting

*I wish to know about*
- Why some PGs take 10 seconds until peering finishes.
- Why the Ceph log is quiet during peering.
- Whether this behavior is intended in Ceph.

*And please give some advice,*
- Is there any way to improve peering speed?
- Or is there a way to keep peering from affecting clients?

P.S.
- I observed the same symptom in the following environments.
  -> Octopus Version, Reef Version, Cephadm, Ceph-Ansible
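To make the silent interval easier to compare across PGs, here is a minimal sketch that groups OSD log lines by PG and reports the longest gap between consecutive entries. It is not part of Ceph. The names `LINE_RE`, `events_by_pg` and `largest_gap_seconds` are hypothetical helpers, and the regular expression only assumes the line shape visible in the excerpts above (timestamp, thread id, level, `osd.N`, `pg_epoch`, `pg[...]`).

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed line shape, taken from the excerpts in this post:
# <timestamp> <thread id> <level> osd.N pg_epoch: E pg[POOL.ID...] ...
LINE_RE = re.compile(
    r"^(\S+)\s+[0-9a-f]+\s+-?\d+\s+osd\.\d+\s+pg_epoch:\s+\d+\s+pg\[(\d+\.[0-9a-f]+)"
)


def events_by_pg(lines):
    """Group log timestamps by PG id; lines that do not match are skipped."""
    events = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f%z")
            events[match.group(2)].append(stamp)
    return events


def largest_gap_seconds(stamps):
    """Longest interval between consecutive log entries, in seconds."""
    ordered = sorted(stamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return max(gaps, default=0.0)


# A subset of the log lines quoted in this post (state<Start> lines only).
SAMPLE = """\
2024-04-11T15:32:44.693+0900 7f108b522700 1 osd.7 pg_epoch: 27368 pg[6.8] state<Start>: transitioning to Primary
2024-04-11T15:32:45.165+0900 7f108f52a700 1 osd.7 pg_epoch: 27377 pg[6.8] state<Start>: transitioning to Primary
2024-04-11T15:32:45.169+0900 7f108b522700 1 osd.7 pg_epoch: 27377 pg[30.20] state<Start>: transitioning to Stray
2024-04-11T15:32:46.173+0900 7f108b522700 1 osd.7 pg_epoch: 27378 pg[30.20] state<Start>: transitioning to Stray
2024-04-11T15:32:57.794+0900 7f108b522700 1 osd.7 pg_epoch: 27390 pg[30.20] state<Start>: transitioning to Stray
""".splitlines()

if __name__ == "__main__":
    for pg, stamps in sorted(events_by_pg(SAMPLE).items()):
        print(f"pg {pg}: largest silent gap {largest_gap_seconds(stamps):.3f}s")
```

Run against these excerpts, it reports a largest gap of about 0.47 seconds for pg 6.8 and about 11.62 seconds for pg 30.20, in line with the 0.5 and 11.6 second figures quoted above. To use it on a full OSD log, pass the lines of the file to `events_by_pg` instead of `SAMPLE`.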