On 12/14/2016 01:48 PM, Gabriel Krisman Bertazi wrote:
> From: Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxxxxxxx>
>
> In blk_mq_map_swqueue, there is a memory optimization that frees the
> tags of a queue that has gone unmapped. Later, if that hctx is remapped
> after another topology change, the tags need to be reallocated.
>
> If this allocation fails, a simple WARN_ON triggers, but the block layer
> ends up with an active hctx without any corresponding set of tags.
> Then, any incoming IO to that hctx can trigger an Oops.
>
> I can reproduce it consistently by running IO, flipping CPUs on and off
> and eventually injecting a memory allocation failure in that path.
>
> In the fix below, if the system experiences a failed allocation of any
> hctx's tags, we remap all the ctxs of that queue to the hctx_0, which
> should always keep its tags. There is a minor performance hit, since
> our mapping just got worse after the error path, but this is
> the simplest solution to handle this error path. The performance hit
> will disappear after another successful remap.
>
> I considered dropping the memory optimization altogether, but it
> seemed a bad trade-off to handle this very specific error case.

Thanks, this looks fine to me now. Both of your patches are queued up
for inclusion in 4.10.

--
Jens Axboe
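
For readers following the thread without the patch at hand, the fallback described in the quoted text can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the kernel patch: `model_queue`, `model_hctx`, `model_alloc_tags`, `model_map_swqueue`, the `fail_next_alloc` injection flag and the sizes `NR_CPUS`/`NR_HCTX` are all hypothetical names made up for illustration, and `malloc` stands in for the real tag allocation. It shows the three behaviours the description mentions: tags of an unmapped hctx are freed, tags are reallocated when the hctx is mapped again, and on allocation failure every ctx of that hctx is redirected to hctx 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS 4   /* hypothetical: one software queue (ctx) per CPU */
#define NR_HCTX 4   /* hypothetical: number of hardware queues (hctx) */

struct model_hctx {
    void *tags;                  /* NULL while unmapped and freed */
};

struct model_queue {
    struct model_hctx hctx[NR_HCTX];
    int cpu_to_hctx[NR_CPUS];    /* which hctx each ctx feeds */
};

/* Hypothetical hook: makes the next tag allocation fail exactly once. */
static bool fail_next_alloc;

static void *model_alloc_tags(void)
{
    if (fail_next_alloc) {
        fail_next_alloc = false;
        return NULL;
    }
    return malloc(64);           /* stand-in for a real tag set */
}

static void model_queue_init(struct model_queue *q)
{
    for (int h = 0; h < NR_HCTX; h++)
        q->hctx[h].tags = NULL;
    q->hctx[0].tags = malloc(64);    /* hctx 0 keeps its tags for good */
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        q->cpu_to_hctx[cpu] = 0;
}

/*
 * Apply a new CPU -> hctx map (entries must be in [0, NR_HCTX)).
 * Returns how many ctxs had to fall back to hctx 0.
 */
static int model_map_swqueue(struct model_queue *q, const int *new_map)
{
    int fallbacks = 0;

    /* The fallback relies on hctx 0 never losing its tags. */
    assert(q->hctx[0].tags != NULL);

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        q->cpu_to_hctx[cpu] = new_map[cpu];

    for (int h = 1; h < NR_HCTX; h++) {
        bool mapped = false;

        for (int cpu = 0; cpu < NR_CPUS; cpu++)
            if (q->cpu_to_hctx[cpu] == h)
                mapped = true;

        if (!mapped) {
            /* Memory optimization: drop tags of an unmapped hctx. */
            free(q->hctx[h].tags);   /* free(NULL) is a no-op */
            q->hctx[h].tags = NULL;
            continue;
        }
        if (q->hctx[h].tags)
            continue;                /* still has tags from before */

        q->hctx[h].tags = model_alloc_tags();
        if (q->hctx[h].tags)
            continue;                /* reallocation succeeded */

        /* Allocation failed: send every ctx of this hctx to hctx 0. */
        for (int cpu = 0; cpu < NR_CPUS; cpu++) {
            if (q->cpu_to_hctx[cpu] == h) {
                q->cpu_to_hctx[cpu] = 0;
                fallbacks++;
            }
        }
    }
    return fallbacks;
}
```

In this model, calling `model_map_swqueue` again with the same map after a failed attempt retries the allocation, which mirrors the remark that the performance hit disappears after another successful remap.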