Hi Coly Li--

Just a couple of questions.

On 02/27/2018 08:55 AM, Coly Li wrote:
> +#define BACKING_DEV_OFFLINE_TIMEOUT 5

I think you wanted this to be 30 (per commit message)-- was this turned
down for testing or deliberate?

> +static int cached_dev_status_update(void *arg)
> +{
> +	struct cached_dev *dc = arg;
> +	struct request_queue *q;
> +	char buf[BDEVNAME_SIZE];
> +
> +	/*
> +	 * If this delayed worker is stopping outside, directly quit here.
> +	 * dc->io_disable might be set via sysfs interface, so check it
> +	 * here too.
> +	 */
> +	while (!kthread_should_stop() && !dc->io_disable) {
> +		q = bdev_get_queue(dc->bdev);
> +		if (blk_queue_dying(q))

I am not sure-- is this the correct test to know if the bdev is offline?
It's very sparsely used outside of the core block system. (Is there any
scenario where the queue comes back? Can the device be "offline" and
still have a live queue? Another approach might be to set io_disable
when there's an extended period without a successful IO, and it might be
possible to do this with atomics without a thread).

This approach can work if you're certain that blk_queue_dying is an
appropriate test for all failure scenarios (I just don't know).

Mike
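
P.S. For illustration, the "atomics without a thread" idea above could look roughly like the sketch below. This is userspace C11, not bcache code: struct dev_health, dev_health_init(), io_succeeded(), io_failed(), and OFFLINE_TIMEOUT_SECS are all hypothetical names, and the caller is assumed to pass in a monotonic time in seconds. A kernel version would presumably use jiffies and the kernel's own atomic helpers instead.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant; 30 mirrors the value named in the commit message. */
#define OFFLINE_TIMEOUT_SECS 30

/* Hypothetical per-device state; stands in for fields in struct cached_dev. */
struct dev_health {
	_Atomic int64_t last_ok_secs;	/* time of the last successful IO */
	atomic_bool io_disable;		/* set once the device is judged offline */
};

static void dev_health_init(struct dev_health *h, int64_t now)
{
	atomic_init(&h->last_ok_secs, now);
	atomic_init(&h->io_disable, false);
}

/* Call from the IO completion path on success. */
static void io_succeeded(struct dev_health *h, int64_t now)
{
	atomic_store_explicit(&h->last_ok_secs, now, memory_order_relaxed);
}

/*
 * Call from the IO completion path on error. If no IO has succeeded for
 * OFFLINE_TIMEOUT_SECS, latch io_disable. Returns the current io_disable.
 */
static bool io_failed(struct dev_health *h, int64_t now)
{
	int64_t last = atomic_load_explicit(&h->last_ok_secs,
					    memory_order_relaxed);

	if (now - last >= OFFLINE_TIMEOUT_SECS)
		atomic_store(&h->io_disable, true);

	return atomic_load(&h->io_disable);
}
```

The check only runs when an IO fails, so an idle device is never marked offline by this sketch; whether that is acceptable depends on the failure scenarios you want to cover.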