Hi Veronica,

On Wed, 9 Mar 2016, Veronica Estrada Galinanes wrote:
> From ceph website: "Deep scrubbing (weekly) reads the data and uses
> checksums to ensure data integrity."
>
> 1. Why do you use "weekly" deep scrubbing? How many times a year one disk
> is scrubbed on average in a Ceph cluster? Schwartz* used three times per
> year. "Interestingly, we did not observe a further decrease in data loss as
> we increased the scrubbing frequency from three to eleven times per year;
> rather, a slight increase in data loss was noticed. This phenomenon comes
> from the POH (power-on-hour) effect on drive reliability. Aggressive
> scrubbing requires more power cycles, adversely affecting drive reliability.
> "

In Ceph clusters disks are generally never powered down. It sounds like
Schwarz et al. were studying a different type of storage system. In our
case, it's all about how quickly you discover a defect and repair around
it.

> 2. Do you have different scrubbing policies according to the redundancy
> method? For example, more scrubs when using replication rather than erasure
> coding?

We do let you change the scrub intervals on a per-pool basis as well (see
pool_opts in osd/osd_types.h), so you can adjust the policy based on
either the pool type (replicated vs erasure coded) or the importance of
the data it contains.

sage
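
To put the frequencies in this thread on one scale, here is a minimal sketch of the arithmetic relating a scrub interval to scrubs per year. It does not talk to Ceph. The function names are hypothetical, and it assumes the interval is expressed in seconds and that a scrub runs exactly once per interval, which a real cluster only approximates.

```python
# Sketch only: convert between a scrub interval and scrubs per year.
# Assumes one scrub per interval and a 365-day year (both simplifications).

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def scrubs_per_year(interval_seconds):
    """How many scrubs fit in a year at the given interval (hypothetical helper)."""
    return SECONDS_PER_YEAR / interval_seconds


def interval_seconds_for(target_scrubs_per_year):
    """Interval in seconds that yields the target number of scrubs per year."""
    return SECONDS_PER_YEAR / target_scrubs_per_year


if __name__ == "__main__":
    weekly = 7 * SECONDS_PER_DAY
    print("weekly deep scrub, scrubs/year:", round(scrubs_per_year(weekly), 1))
    for target in (3, 11):
        days = interval_seconds_for(target) / SECONDS_PER_DAY
        print(target, "scrubs/year needs an interval of", round(days, 1), "days")
```

Under these assumptions a weekly deep scrub is roughly 52 scrubs per year, well above the three and eleven per year compared in the quoted study.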