On Mon, Jun 08, 2020 at 09:07:24PM -0700, Josh Snyder wrote:
> Previously, we performed truncation of I/O issue/completion times during
> calculation of io_ticks, counting only I/Os which cross a jiffy
> boundary. The effect is a sampling of I/Os: at every boundary between
> jiffies we ask "is there an outstanding I/O" and increment a counter if
> the answer is yes. This produces results that are accurate (they don't
> systematically over- or under-count), but not precise (there is high
> variance associated with only taking 100 samples per second).
>
> This change modifies the sampling rate from 100Hz to 976562.5Hz (1
> sample per 1024 nanoseconds). I chose this sampling rate by simulating a
> workload in which I/Os are issued randomly (by a Poisson process), and
> processed in constant time: an M/D/∞ system (Kendall's notation). My
> goal was to produce a sampled utilization fraction which was correct to
> one part-per-thousand given one second of samples.
>
> The tradeoff of the higher sampling rate is increased synchronization
> overhead caused by more frequent compare-and-swap operations. The
> technique of commit 5b18b5a73760 ("block: delete part_round_stats and
> switch to less precise counting") is to allow multiple I/Os to complete
> while performing only one synchronized operation. As we are increasing
> the sample rate by a factor of 10000, we will less frequently be able to
> exercise the synchronization-free code path.

Not sure if we need such a precise %util, and a ~1MHz sampling rate may
cause cmpxchg() to run up to ~1M times/sec for each partition, whose
overhead might be observable.

Thanks,
Ming