Re: [PATCH v2 06/12] bcache: set error_limit correctly

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Tang Junhui <tang.junhui@xxxxxxxxxx>

Hello Coly:

Then in bch_count_io_errors(), why did us still keep these code:
> 92             unsigned errors = atomic_add_return(1 << IO_ERROR_SHIFT,
> 93                                                 &ca->io_errors);
> 94             errors >>= IO_ERROR_SHIFT;

why not just modify the code as bellow:
> 92             unsigned errors = atomic_add_return(1,
> 93                                                 &ca->io_errors);


>Struct cache uses io_errors for two purposes,
>- Error decay: when cache set error_decay is set, io_errors is used to
>  generate a small piece of delay when I/O error happens.
>- I/O errors counter: in order to generate big enough value for error
>  decay, I/O errors counter value is stored by left shifting 20 bits (a.k.a
>  IO_ERROR_SHIFT).
>
>In function bch_count_io_errors(), if I/O errors counter reaches cache set
>error limit, bch_cache_set_error() will be called to retire the whold cache
>set. But current code is problematic when checking the error limit, see the
>following code piece from bch_count_io_errors(),
>
> 90     if (error) {
> 91             char buf[BDEVNAME_SIZE];
> 92             unsigned errors = atomic_add_return(1 << IO_ERROR_SHIFT,
> 93                                                 &ca->io_errors);
> 94             errors >>= IO_ERROR_SHIFT;
> 95
> 96             if (errors < ca->set->error_limit)
> 97                     pr_err("%s: IO error on %s, recovering",
> 98                            bdevname(ca->bdev, buf), m);
> 99             else
>100                     bch_cache_set_error(ca->set,
>101                                         "%s: too many IO errors %s",
>102                                         bdevname(ca->bdev, buf), m);
>103     }
>
>At line 94, errors is right shifting IO_ERROR_SHIFT bits, now it is real
>errors counter to compare at line 96. But ca->set->error_limit is initia-
>lized with an amplified value in bch_cache_set_alloc(),
>1545         c->error_limit  = 8 << IO_ERROR_SHIFT;
>
>It means by default, in bch_count_io_errors(), before 8<<20 errors happened
>bch_cache_set_error() won't be called to retire the problematic cache
>device. If the average request size is 64KB, it means bcache won't handle
>failed device until 512GB data is requested. This is too large to be an I/O
>threashold. So I believe the correct error limit should be much less.
>
>This patch sets default cache set error limit to 8, then in
>bch_count_io_errors() when errors counter reaches 8 (if it is default
>value), function bch_cache_set_error() will be called to retire the whole
>cache set. This patch also removes bits shifting when store or show
>io_error_limit value via sysfs interface.
>
>Nowadays most of SSDs handle internal flash failure automatically by LBA
>address re-indirect mapping. If an I/O error can be observed by upper layer
>code, it will be a notable error because that SSD can not re-indirect
>map the problematic LBA address to an available flash block. This situation
>indicates the whole SSD will be failed very soon. Therefore setting 8 as
>the default io error limit value makes sense, it is enough for most of
>cache devices.
>
>Changelog:
>v2: add reviewed-by from Hannes.
>v1: initial version for review.
>
>Signed-off-by: Coly Li <colyli@xxxxxxx>
>Reviewed-by: Hannes Reinecke <hare@xxxxxxxx>
>Cc: Michael Lyle <mlyle@xxxxxxxx>
>Cc: Junhui Tang <tang.junhui@xxxxxxxxxx>
>---
> drivers/md/bcache/bcache.h | 1 +
> drivers/md/bcache/super.c  | 2 +-
> drivers/md/bcache/sysfs.c  | 4 ++--
> 3 files changed, 4 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
>index 88d938c8d027..7d7512fa4f09 100644
>--- a/drivers/md/bcache/bcache.h
>+++ b/drivers/md/bcache/bcache.h
>@@ -663,6 +663,7 @@ struct cache_set {
>         ON_ERROR_UNREGISTER,
>         ON_ERROR_PANIC,
>     }            on_error;
>+#define DEFAULT_IO_ERROR_LIMIT 8
>     unsigned        error_limit;
>     unsigned        error_decay;
> 
>diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
>index 6d888e8fea8c..a373648b5d4b 100644
>--- a/drivers/md/bcache/super.c
>+++ b/drivers/md/bcache/super.c
>@@ -1583,7 +1583,7 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
> 
>     c->congested_read_threshold_us    = 2000;
>     c->congested_write_threshold_us    = 20000;
>-    c->error_limit    = 8 << IO_ERROR_SHIFT;
>+    c->error_limit    = DEFAULT_IO_ERROR_LIMIT;
> 
>     return c;
> err:
>diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
>index b7166c504cdb..ba62e987b503 100644
>--- a/drivers/md/bcache/sysfs.c
>+++ b/drivers/md/bcache/sysfs.c
>@@ -560,7 +560,7 @@ SHOW(__bch_cache_set)
> 
>     /* See count_io_errors for why 88 */
>     sysfs_print(io_error_halflife,    c->error_decay * 88);
>-    sysfs_print(io_error_limit,    c->error_limit >> IO_ERROR_SHIFT);
>+    sysfs_print(io_error_limit,    c->error_limit);
> 
>     sysfs_hprint(congested,
>              ((uint64_t) bch_get_congested(c)) << 9);
>@@ -660,7 +660,7 @@ STORE(__bch_cache_set)
>     }
> 
>     if (attr == &sysfs_io_error_limit)
>-        c->error_limit = strtoul_or_return(buf) << IO_ERROR_SHIFT;
>+        c->error_limit = strtoul_or_return(buf);
> 
>     /* See count_io_errors() for why 88 */
>     if (attr == &sysfs_io_error_halflife)
>-- 
>2.15.1

Thanks,
Tang Junhui



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux