Re: ed74ae0342 ("blk-mq: Avoid that a completion can be ignored .."): BUG: kernel hang in test stage

On 4/24/18 3:00 PM, kernel test robot wrote:
> Greetings,
> 
> 0day kernel testing robot got the below dmesg and the first bad commit is
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-linus
> 
> commit ed74ae03424684a6ad8a973c3fa727c6b4162432
> Author:     Bart Van Assche <bart.vanassche@xxxxxxx>
> AuthorDate: Thu Apr 19 09:43:53 2018 -0700
> Commit:     Jens Axboe <axboe@xxxxxxxxx>
> CommitDate: Thu Apr 19 14:21:47 2018 -0600
> 
>     blk-mq: Avoid that a completion can be ignored for BLK_EH_RESET_TIMER
>     
>     The blk-mq timeout handling code ignores completions that occur after
>     blk_mq_check_expired() has been called and before blk_mq_rq_timed_out()
>     has reset rq->aborted_gstate. If a block driver timeout handler always
>     returns BLK_EH_RESET_TIMER then the result will be that the request
>     never terminates.
>     
>     Fix this race as follows:
>     - Use the deadline instead of the request generation to detect whether
>       or not a request timer fired after reinitialization of a request.
>     - Store the request state in the lowest two bits of the deadline instead
>       of the lowest two bits of 'gstate'.
>     - Rename MQ_RQ_STATE_MASK into RQ_STATE_MASK and change it from an
>       enumeration member into a #define such that its type can be changed
>       into unsigned long. That allows to write & ~RQ_STATE_MASK instead of
>       ~(unsigned long)RQ_STATE_MASK.
>     - Remove all request member variables that became superfluous due to
>       this change: gstate, gstate_seq and aborted_gstate_sync.
>     - Remove the request state information that became superfluous due to this
>       patch, namely RQF_MQ_TIMEOUT_EXPIRED.
>     - Remove the code that became superfluous due to this change, namely
>       the RCU lock and unlock statements in blk_mq_complete_request() and
>       also the synchronize_rcu() call in the timeout handler.
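
For readers following the commit message quoted above, here is a minimal userspace sketch of the idea of keeping the request state in the lowest two bits of the deadline. It is not the kernel code: the enum values, pack_deadline(), deadline_of(), state_of() and timer_is_stale() are hypothetical names made up for illustration, and only RQ_STATE_MASK is taken from the commit message. Defining the mask as an unsigned long constant is what lets ~RQ_STATE_MASK be used without a cast.

```c
/* Illustrative sketch only; not taken from the blk-mq sources. */

/* Lowest two bits carry the state; the UL suffix makes ~RQ_STATE_MASK
 * an unsigned long, so no cast is needed when masking a deadline. */
#define RQ_STATE_MASK 0x3UL

/* Hypothetical state values for this sketch. */
enum rq_state { RQ_IDLE = 0, RQ_IN_FLIGHT = 1, RQ_COMPLETE = 2 };

/* Combine a deadline and a state into one word. The deadline loses its
 * two lowest bits, i.e. it is rounded down to a multiple of four. */
static unsigned long pack_deadline(unsigned long deadline, enum rq_state state)
{
	return (deadline & ~RQ_STATE_MASK) | (unsigned long)state;
}

/* Extract the deadline part of a packed word. */
static unsigned long deadline_of(unsigned long packed)
{
	return packed & ~RQ_STATE_MASK;
}

/* Extract the state part of a packed word. */
static enum rq_state state_of(unsigned long packed)
{
	return (enum rq_state)(packed & RQ_STATE_MASK);
}

/* Assumed interpretation of "use the deadline instead of the request
 * generation": a timer armed for one deadline is stale if the request
 * now carries a different deadline, regardless of its state bits. */
static int timer_is_stale(unsigned long armed, unsigned long current)
{
	return deadline_of(armed) != deadline_of(current);
}
```

With these helpers, pack_deadline(1000, RQ_IN_FLIGHT) yields a word whose deadline part is still 1000 and whose state part is RQ_IN_FLIGHT, and a state change alone does not make a pending timer look stale.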

Any chance you can try with the newer version?

https://github.com/bvanassche/linux/commit/4acd555fa13087

-- 
Jens Axboe