On Tue, Aug 23, 2016 at 03:14:23PM -0600, Jens Axboe wrote:
> On 08/23/2016 03:11 PM, Jens Axboe wrote:
> > My workload looks similar to yours, in that it's high depth and with a
> > lot of jobs to keep most CPUs loaded. My bash script is different than
> > yours, I'll try that and see if it helps here.
>
> Actually, I take that back. You're not using O_DIRECT, hence all your
> jobs are running at QD=1, not the 256 specified. That looks odd, but
> I'll try, maybe it'll hit something different.

I haven't recreated this either, but I think I can logically see why this
failure is happening.

I sent an nvme driver patch earlier on this thread to exit the hardware
context, which I thought would do the trick if the hctx's tags were being
moved. That turns out to be wrong for a couple of reasons. First, we can't
release the nvmeq->tags when a hctx exits, because that nvmeq may be used
by other namespaces that need to point to the device's tag set. The other
reason is that blk-mq doesn't exit or init hardware contexts when remapping
for a CPU event, leaving the nvme driver unaware that a hardware context
now points to a different tag set.

So I think I see why this test would fail; I don't know about a fix yet.
Maybe the nvme driver needs some indirection instead of pointing directly
to the tagset after init_hctx.
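
To make the indirection idea a bit more concrete, here is a tiny userspace
model of what I mean. All the struct and field names below are made-up
stand-ins for the nvme/blk-mq structures, not actual driver code: a tags
pointer cached at init_hctx time goes stale if blk-mq later changes which
tags back that hctx, while an extra level of indirection into the tag set
always dereferences the current entry.

/*
 * Simplified userspace sketch of the stale-pointer problem and the
 * proposed indirection. Hypothetical names; not real driver code.
 */
#include <assert.h>
#include <stdio.h>

#define NR_HW_QUEUES 2

struct mock_tags { int id; };                 /* stands in for struct blk_mq_tags */

struct mock_tag_set {                         /* stands in for the device's tag set */
	struct mock_tags *tags[NR_HW_QUEUES];
};

struct mock_nvme_queue {
	struct mock_tags *cached_tags;        /* direct pointer captured at "init_hctx" */
	struct mock_tags **indirect_tags;     /* indirection into the tag set entry */
};

int main(void)
{
	struct mock_tags t0 = { .id = 0 }, t1 = { .id = 1 };
	struct mock_tag_set set = { .tags = { &t0, &t1 } };
	struct mock_nvme_queue nvmeq;

	/* "init_hctx": this queue is associated with tag set entry 0 */
	nvmeq.cached_tags = set.tags[0];
	nvmeq.indirect_tags = &set.tags[0];

	/*
	 * CPU remap: blk-mq reshuffles which tags back the hctx, but never
	 * calls exit_hctx/init_hctx, so the driver is not told about it.
	 */
	set.tags[0] = &t1;

	printf("cached:   tags id %d (stale)\n", nvmeq.cached_tags->id);
	printf("indirect: tags id %d (current)\n", (*nvmeq.indirect_tags)->id);

	assert((*nvmeq.indirect_tags)->id == 1);
	return 0;
}

In the real driver that might mean nvmeq holding a pointer into the
device's tag set array (or looking the tags up per request) rather than a
blk_mq_tags pointer captured once in init_hctx, but I haven't tried it yet.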