Re: Rocksdb compaction and OSD timeout

Hi,

> On 7 Sep 2023, at 10:05, J-P Methot <jp.methot@xxxxxxxxxxxxxxxxx> wrote:
> 
> We're running the latest Pacific release on our production cluster and we've been seeing the dreaded "'OSD::osd_op_tp thread 0x7f346aa64700' had timed out after 15.000000954s" error. We have reason to believe this happens each time the RocksDB compaction process is launched on an OSD. My question is: when the cluster detects that an OSD has timed out, does that interrupt the compaction process? This seems to be what's happening, but it's not immediately obvious. We are currently facing an endless loop of random OSDs timing out, and compaction being interrupted before it finishes might explain that.

You are running online compaction on these OSDs (the `ceph osd compact ${osd_id}` command), right?
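
To check whether the timeouts really line up with compaction, it may help to pull them out of the OSD log first. Below is a minimal sketch in Python, assuming the log lines contain the message exactly as quoted above; the function name `find_timeouts` is hypothetical, and the rest of the log line format (timestamps, prefixes) is not assumed.

```python
import re

# Matches the timeout message as quoted in the original post, e.g.
# 'OSD::osd_op_tp thread 0x7f346aa64700' had timed out after 15.000000954s
# Anything before or after that fragment on the line is ignored.
TIMEOUT_RE = re.compile(
    r"'(?P<pool>[\w:]+) thread (?P<thread>0x[0-9a-f]+)' "
    r"had timed out after (?P<secs>\d+(?:\.\d+)?)s?"
)


def find_timeouts(lines):
    """Return a list of (thread_pool, thread_id, seconds) for matching lines."""
    hits = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            hits.append((m.group("pool"), m.group("thread"), float(m.group("secs"))))
    return hits
```

Feeding it the lines of one OSD's log and comparing the timestamps of the matching lines against the times compaction was started should show whether the two actually coincide.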



k


