[bug report] A race between blk_cleanup_queue and blk_timeout_work


 



Hi Jens, Christoph,

There is a scenario: if a disk is unplugged while I/O is running on it, the I/O stays blocked indefinitely, as follows:
......
Jobs: 3 (f=3): [M_MM__] [89.7% done] [0K/0K /s] [0 /0  iops] [eta 00m:36s]
......
I found a race between blk_cleanup_queue and blk_timeout_work (kernel is 4.14.0-rc1):
(1) Disk removal path
When a disk is unplugged, scsi_remove_target is called to delete the disk:
scsi_remove_target------>
     __scsi_remove_target---->
        scsi_remove_device--->
            __scsi_remove_device--->
                blk_cleanup_queue
                    blk_freeze_queue
                    .....
                    __blk_drain_queue
scsi_remove_target eventually calls blk_cleanup_queue, which in turn calls blk_freeze_queue and __blk_drain_queue. In blk_freeze_queue, for !blk_mq (our driver falls into this case), q->q_usage_counter is killed. __blk_drain_queue is a loop on a condition that is always true; it exits only when drain is 0. If all I/Os have completed, it exits; otherwise it waits and checks for unfinished I/Os again every 10 ms.
(2) Timeout path
When an I/O issued from the block layer times out, blk_timeout_work is called. blk_timeout_work first calls blk_queue_enter, which tries to take a reference on q->q_usage_counter. If that fails, blk_timeout_work returns immediately and no timeout handling is done.

So when a disk is unplugged, the removal path kills q->q_usage_counter in blk_cleanup_queue (via blk_freeze_queue). Any I/Os that have not finished by then can only be completed through their timeout. When the timeout fires, blk_timeout_work tries to get q->q_usage_counter, but it has already been killed, so blk_queue_enter fails, the timeout handling is skipped, and the I/O is never processed. Meanwhile __blk_drain_queue loops forever, because there are still I/Os that have not ended.
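
To illustrate the ordering problem, here is a minimal userspace sketch in C. It is only a model of the sequence described above, not kernel code: struct model_queue and every model_* function are hypothetical names made up for this example, the 10 ms sleep between polls is left out, and the drain loop is bounded by max_polls so that the model terminates (the real __blk_drain_queue has no such bound).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model of the queue state; not a kernel structure. */
struct model_queue {
        bool usage_counter_dead;  /* models a killed q->q_usage_counter */
        int in_flight;            /* requests that have not completed yet */
};

/* Models blk_freeze_queue() for !blk_mq: the usage counter is killed. */
static void model_freeze(struct model_queue *q)
{
        q->usage_counter_dead = true;
}

/* Models blk_queue_enter(): fails once the usage counter has been killed. */
static int model_queue_enter(const struct model_queue *q)
{
        return q->usage_counter_dead ? -1 : 0;
}

/*
 * Models blk_timeout_work(): expires all pending requests, unless the
 * queue cannot be entered. Returns the number of requests expired.
 */
static int model_timeout_work(struct model_queue *q)
{
        int expired;

        if (model_queue_enter(q))
                return 0;         /* early return: nothing gets expired */

        expired = q->in_flight;
        q->in_flight = 0;
        return expired;
}

/*
 * Models the polling loop in __blk_drain_queue(), with an artificial
 * bound. Returns true if the queue drained within max_polls polls.
 */
static bool model_drain(struct model_queue *q, int max_polls)
{
        int i;

        assert(max_polls > 0);
        for (i = 0; i < max_polls; i++) {
                if (q->in_flight == 0)
                        return true;
                /* The request timer fires while the drain loop waits. */
                model_timeout_work(q);
        }
        return q->in_flight == 0;
}
```

In this model, a queue with pending requests drains as long as model_freeze has not been called before model_drain. Once model_freeze runs first, model_timeout_work always returns early and model_drain never sees in_flight reach 0, which corresponds to the hang reported here.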

I added a printk to blk_timeout_work as shown below. When the issue occurs, I can see this printk being triggered:

void blk_timeout_work(struct work_struct *work)
{
        struct request_queue *q =
                container_of(work, struct request_queue, timeout_work);
        unsigned long flags, next = 0;
        struct request *rq, *tmp;
        int next_set = 0;

        if (blk_queue_enter(q, true)) {
                pr_err("%s %d\n", __func__, __LINE__);  /* printk added here */
                return;
        }

        spin_lock_irqsave(q->queue_lock, flags);


        list_for_each_entry_safe(rq, tmp, &q->timeout_list, timeout_list)
                blk_rq_check_expired(rq, &next, &next_set);

        if (next_set)
                mod_timer(&q->timeout, round_jiffies_up(next));

        spin_unlock_irqrestore(q->queue_lock, flags);

        blk_queue_exit(q);
}


regards,
shawn



