On 01/14/2018 03:42 PM, Coly Li wrote: > Currently bcache does not handle backing device failure, if backing > device is offline and disconnected from system, its bcache device can still > be accessible. If the bcache device is in writeback mode, I/O requests even > can success if the requests hit on cache device. That is to say, when and > how bcache handles offline backing device is undefined. > > This patch tries to handle backing device offline in a rather simple way, > - Add cached_dev->status_update_thread kernel thread to update backing > device status in every 1 second. > - Add cached_dev->offline_seconds to record how many seconds the backing > device is observed to be offline. If the backing device is offline for > BACKING_DEV_OFFLINE_TIMEOUT (30) seconds, set dc->io_disable to 1 and > call bcache_device_stop() to stop the bache device which linked to the > offline backing device. > > Now if a backing device is offline for BACKING_DEV_OFFLINE_TIMEOUT seconds, > its bcache device will be removed, then user space application writing on > it will get error immediately, and handler the device failure in time. > > This patch is quite simple, does not handle more complicated situations. > Once the bcache device is stopped, users need to recovery the backing > device, register and attach it manually. > > Signed-off-by: Coly Li <colyli@xxxxxxx> > Cc: Michael Lyle <mlyle@xxxxxxxx> > Cc: Hannes Reinecke <hare@xxxxxxxx> > Cc: Junhui Tang <tang.junhui@xxxxxxxxxx> > --- > drivers/md/bcache/bcache.h | 2 ++ > drivers/md/bcache/super.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 57 insertions(+) > > diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h > index 5a811959392d..9eedb35d01bc 100644 > --- a/drivers/md/bcache/bcache.h > +++ b/drivers/md/bcache/bcache.h > @@ -338,6 +338,7 @@ struct cached_dev { > > struct keybuf writeback_keys; > > + struct task_struct *status_update_thread; > /* > * Order the write-half of writeback operations strongly in dispatch > * order. (Maintain LBA order; don't allow reads completing out of > @@ -384,6 +385,7 @@ struct cached_dev { > #define DEFAULT_CACHED_DEV_ERROR_LIMIT 64 > atomic_t io_errors; > unsigned error_limit; > + unsigned offline_seconds; > }; > > enum alloc_reserve { > diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c > index 14fce3623770..85adf1e29d11 100644 > --- a/drivers/md/bcache/super.c > +++ b/drivers/md/bcache/super.c > @@ -646,6 +646,11 @@ static int ioctl_dev(struct block_device *b, fmode_t mode, > unsigned int cmd, unsigned long arg) > { > struct bcache_device *d = b->bd_disk->private_data; > + struct cached_dev *dc = container_of(d, struct cached_dev, disk); > + > + if (dc->io_disable) > + return -EIO; > + > return d->ioctl(d, mode, cmd, arg); > } > > @@ -856,6 +861,45 @@ static void calc_cached_dev_sectors(struct cache_set *c) > c->cached_dev_sectors = sectors; > } > > +#define BACKING_DEV_OFFLINE_TIMEOUT 5 > +static int cached_dev_status_update(void *arg) > +{ > + struct cached_dev *dc = arg; > + struct request_queue *q; > + char buf[BDEVNAME_SIZE]; > + > + /* > + * If this delayed worker is stopping outside, directly quit here. > + * dc->io_disable might be set via sysfs interface, so check it > + * here too. > + */ > + while (!kthread_should_stop() && !dc->io_disable) { > + q = bdev_get_queue(dc->bdev); > + if (blk_queue_dying(q)) > + dc->offline_seconds++; > + else > + dc->offline_seconds = 0; > + > + if (dc->offline_seconds >= BACKING_DEV_OFFLINE_TIMEOUT) { > + pr_err("%s: device offline for %d seconds", > + bdevname(dc->bdev, buf), > + BACKING_DEV_OFFLINE_TIMEOUT); > + pr_err("%s: disable I/O request due to backing " > + "device offline", dc->disk.name); > + dc->io_disable = true; > + /* let others know earlier that io_disable is true */ > + smp_mb(); > + bcache_device_stop(&dc->disk); > + break; > + } > + > + schedule_timeout_interruptible(HZ); > + } > + > + dc->status_update_thread = NULL; > + return 0; > +} > + > void bch_cached_dev_run(struct cached_dev *dc) > { > struct bcache_device *d = &dc->disk; > @@ -898,6 +942,15 @@ void bch_cached_dev_run(struct cached_dev *dc) > if (sysfs_create_link(&d->kobj, &disk_to_dev(d->disk)->kobj, "dev") || > sysfs_create_link(&disk_to_dev(d->disk)->kobj, &d->kobj, "bcache")) > pr_debug("error creating sysfs link"); > + > + dc->status_update_thread = kthread_run(cached_dev_status_update, > + dc, > + "bcache_status_update"); > + if (IS_ERR(dc->status_update_thread)) { > + pr_warn("bcache: failed to create bcache_status_update " > + "kthread, continue to run without monitoring backing " > + "device status"); > + } > } > > /* > @@ -1118,6 +1171,8 @@ static void cached_dev_free(struct closure *cl) > kthread_stop(dc->writeback_thread); > if (dc->writeback_write_wq) > destroy_workqueue(dc->writeback_write_wq); > + if (!IS_ERR_OR_NULL(dc->status_update_thread)) > + kthread_stop(dc->status_update_thread); > > if (atomic_read(&dc->running)) > bd_unlink_disk_holder(dc->bdev, dc->disk.disk); > Hmm. Not exactly thrilled with this solution; maybe worth discussing it at LSF. But I can't see how it could be done better currently. Reviewed-by: Hannes Reinecke <hare@xxxxxxxx> Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@xxxxxxx +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-bcache" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html