On 17/01/2018 2:09 PM, tang.junhui@xxxxxxxxxx wrote: > From: Tang Junhui <tang.junhui@xxxxxxxxxx> > > Hello Coly: > > Then in bch_count_io_errors(), why did us still keep these code: >> 92 unsigned errors = atomic_add_return(1 << IO_ERROR_SHIFT, >> 93 &ca->io_errors); >> 94 errors >>= IO_ERROR_SHIFT; > > why not just modify the code as bellow: >> 92 unsigned errors = atomic_add_return(1, >> 93 &ca->io_errors); > > Hi Junhui, It is because of ca->set->error_decay. When error_decay is set, bcache tries to do an exponential decay for error count. That is, error numbers is decaying against the quantity of io count, this is to avoid long time accumulated errors exceeds error_limit and trigger bch_cache_set_error(). The I/O error counter, uses most significant 12 bits to record real errors number. And too many I/Os may decay the errors number too fast, so left shit it by 20 bits to make sure there are still enough errors number after a while (e.g. the halflife). The we don't use the left shifting, when error_deay is set, and too many I/Os happen, after a small piece of time, number of I/O errors will be decayed too fast to reach error_limit. Therefore IMHO the left shifting is necessary. BTW, I doubt whether current exponential error decay in bch_count_io_errors() works properly. Because I don't catch the connection between IO counter (ca->io_count) and error counter (ca->io_errors). By default ca->set->error_decay is 0, so I doubt how many people use/test this piece of code. (Maybe I am wrong) Coly Li [code snipped]