Quick follow-up: Running "iotop -tobk" shows the culprits seem to be the "bstore_kv_sync" and "rocksdb:low27" threads. Here's the output from one iteration for the daemon process:

07:36:22 513912 be/4 651 0.00 K/s 218342.12 K/s 0.00 % 0.18 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [bstore_kv_sync]
07:36:22 513467 be/4 651 0.00 K/s 134410.60 K/s 0.00 % 0.15 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [rocksdb:low27]
07:36:22 513941 be/4 651 0.00 K/s 63.61 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513942 be/4 651 0.00 K/s 141.36 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513943 be/4 651 0.00 K/s 10.60 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513944 be/4 651 0.00 K/s 3.53 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513945 be/4 651 0.00 K/s 14.14 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513946 be/4 651 0.00 K/s 151.96 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513947 be/4 651 0.00 K/s 7.07 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513948 be/4 651 28.27 K/s 925.90 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513949 be/4 651 0.00 K/s 148.43 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513950 be/4 651 0.00 K/s 7.07 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513951 be/4 651 0.00 K/s 14.14 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513952 be/4 651 0.00 K/s 7.07 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513953 be/4 651 0.00 K/s 14.14 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513954 be/4 651 0.00 K/s 24.74 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513955 be/4 651 0.00 K/s 141.36 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]
07:36:22 513956 be/4 651 28.27 K/s 38.87 K/s 0.00 % 0.00 % ceph-osd -d --cluster ceph --id 9000 --setuser ceph --setgroup ceph [tp_osd_tp]

Is there any way to figure out what these threads are doing?
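The only ideas I've come up with so far are to attach strace/perf to the busiest thread IDs and to watch the OSD's internal BlueFS counters while this is happening. Something along these lines (just a sketch of what I plan to try; the TIDs come from the iotop output above, "osd.9000" is this daemon, and I'm quoting the bluefs counter names bytes_written_wal/bytes_written_sst/bytes_written_slow from memory, so they may not be exact):

# Count write-related syscalls made by the kv sync thread (ctrl-c after ~30s to get the -c summary)
strace -c -p 513912 -e trace=write,pwrite64,pwritev,fdatasync,fsync,io_submit

# Sample stacks of the same thread to see where the writes originate
perf record -g -t 513912 -- sleep 30 && perf report --stdio | head -50

# Snapshot the BlueFS write counters a couple of times and compare how fast they grow
ceph daemon osd.9000 perf dump | python3 -m json.tool | grep -A 40 '"bluefs"'

If the syscall summary is dominated by pwrite64/fdatasync, I'd read that as the writes coming from the RocksDB WAL and compaction paths rather than from client I/O, though that still wouldn't explain why it only kicks in after a few days.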
-- Sam Clippinger

-----Original Message-----
From: Clippinger, Sam <Sam.Clippinger@xxxxxxxxxx>
Sent: Friday, April 8, 2022 7:26 AM
To: ceph-users@xxxxxxx
Subject: OSD daemon writes constantly to device without Ceph traffic - bug?

Hello everyone!

I've noticed something strange since updating our cluster from Nautilus to Pacific 16.2.7. Out of 40 OSDs, one was created with Pacific 16.2.7; all others in the cluster were created with Nautilus or Mimic (the daemons are all running Pacific). Every few days, the OSD created with Pacific will suddenly start writing to its device constantly. As I type this, it is writing 250-350 MiB/s to the drive (according to iotop), while all other OSDs are writing about 15-30 MiB/s to their devices. Read activity is normal - all OSDs are reading 50-100 MiB/s. There isn't nearly enough client activity to justify these writes; the cluster is healthy, and nothing is rebalancing or scrubbing. "ceph osd status" shows the OSD handling about the same number of reads and writes as all the others.

I tried using "ceph tell osd.X config set" to increase every debug_* option to its maximum setting, but nothing seemed to stand out. There was some additional output (not much), mostly "bluestore.MempoolThread(x) _resize_shards", "prioritycache tune_memory", "heartbeat osd_stat" and "ms_handle_reset con" messages.

What else can I do to troubleshoot this? Is this a bug? Restarting the OSD daemon "fixes" it for a few days, but then it always seems to start happening again. I'm planning to recreate all of the OSDs in this cluster this weekend (to split each NVMe drive into multiple OSDs), so I'm concerned that every OSD will show this behavior next week. Should I postpone this weekend's work? I haven't restarted the OSD daemon yet this morning, so I can still try some additional debugging while the writing is going on.

-- Sam Clippinger