On Wed, Feb 19, 2020 at 2:23 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> On Wed, Feb 19, 2020 at 09:54:31AM -0800, Salman Qazi wrote:
> > On Tue, Feb 18, 2020 at 6:55 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Feb 18, 2020 at 08:11:53AM -0800, Jesse Barnes wrote:
> > > > On Fri, Feb 14, 2020 at 7:47 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> > > > > What are the 'other operations'? Are they block IOs?
> > > > >
> > > > > If yes, that is why I suggest to fix submit_bio_wait(), which should cover
> > > > > most of sync bio submission.
> > > > >
> > > > > Anyway, the fix is simple & generic enough, I'd plan to post a formal
> > > > > patch if no one figures out better doable approaches.
> > > >
> > > > Yeah I think any block I/O operation that occurs after the
> > > > BLKSECDISCARD is submitted will also potentially be affected by the
> > > > hung task timeouts, and I think your patch will address that. My only
> > > > concern with it is that it might hide some other I/O "hangs" that are
> > > > due to device misbehavior instead. Yes driver and device timeouts
> > > > should generally catch those, but with this in place we might miss a
> > > > few bugs.
> > > >
> > > > Given the nature of these types of storage devices though, I think
> > > > that's a minor issue and not worth blocking the patch on, given that
> > > > it should prevent a lot of false positive hang reports as Salman
> > > > demonstrated.
> > >
> > > Hello Jesse and Salman,
> > >
> > > One more question about this issue, do you enable BLK_WBT on your test
> > > kernel?
> >
> > It doesn't exist on the original 4.4-based kernel where we reproduced
> > this bug. I am curious how this interacts with this bug.
>
> blk-wbt can throttle discard request and keep discard queue not too
> deep.
>
> However, given block erase is involved in BLKSECDISCARD, I guess blk-wbt
> may not avoid this task hung issue completely.

Thanks for the explanation.
As I said, it takes just one 4K BLKSECDISCARD to cause a 100-second
delay, during which the entire device is unusable. So the queue doesn't
have to be deep at all. It's a single tiny IOCTL.

>
>
> Thanks,
> Ming
>