Hello, Bart.

On Mon, Apr 09, 2018 at 09:30:27PM +0000, Bart Van Assche wrote:
> On Mon, 2018-04-09 at 11:56 -0700, tj@xxxxxxxxxx wrote:
> > On Mon, Apr 09, 2018 at 05:03:05PM +0000, Bart Van Assche wrote:
> > > exist today in the blk-mq timeout handling code cannot be fixed completely
> > > using RCU only.
> >
> > I really don't think that is that complicated. Let's first confirm
> > the race fix and get to narrowing / closing that window.
>
> Two months ago it was reported for the first time that commit 1d9bd5161ba3
> ("blk-mq: replace timeout synchronization with a RCU and generation based
> scheme") introduces a regression. Since that report nobody has posted a
> patch that fixes all races related to blk-mq timeout handling and that only

The two patches using RCU were posted a long time ago. It was just
that the repro that only you had at the time didn't work anymore so we
couldn't confirm the fix.

If we now have a different repro, awesome. Let's see whether the fix
works.

> uses RCU. If you want to continue working on this that's fine with me. But
> since my opinion is that it is impossible to fix these races using RCU only
> I will continue working on an alternative approach. See also "[PATCH]
> blk-mq: Fix a race between resetting the timer and completion handling"
> (https://www.mail-archive.com/linux-block@xxxxxxxxxxxxxxx/msg18089.html).

ISTR discussing that patch earlier. Didn't the RCU based fix get
posted after that discussion?

Thanks.

--
tejun