This is something I was toying with yesterday and cleaned up a bit just now. It's still very much a test patch: there are a few debug checks in there, and some items need improving.

Anyway, the idea here is that we can reduce the cost of getting a tag for a new request if we don't get them piecemeal. Add a per-ctx tag cache, and grab a batch of tags whenever it's empty. If it's not empty, we can just find a free bit there.

/sys/kernel/debug/block/<dev>/<hctx>/<cpu>/tag_hit holds some stats associated with this, so you can check how it's doing. I've seen nice improvements with this in testing.

-- 
Jens Axboe
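
For readers who want the batching idea in concrete form, here is a minimal user-space sketch. It is not the patch itself. Every name in it (tag_map, ctx_cache, grab_batch, get_tag, put_tag, BATCH) is made up for illustration, the shared map is assumed to be a single 64-bit word, and freed tags are assumed to go straight back to the shared map. The hits/misses counters only stand in for the kind of stats the tag_hit file might report.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define BATCH 8  /* hypothetical batch size, not taken from the patch */

/* Shared tag bitmap; stands in for the per-hctx map (assumed: 64 tags). */
struct tag_map {
    _Atomic uint64_t busy;   /* bit set = tag is in use or sits in a cache */
};

/* Per-ctx cache (hypothetical layout), touched only by its owner. */
struct ctx_cache {
    uint64_t free_mask;      /* bit set = tag reserved for this ctx, unused */
    unsigned long hits;      /* allocations served from the cache */
    unsigned long misses;    /* allocations that had to refill first */
};

/* Index of the lowest set bit; caller guarantees m != 0. */
static int lowest_bit(uint64_t m)
{
    int i = 0;

    while (!(m & 1)) {
        m >>= 1;
        i++;
    }
    return i;
}

/* Claim up to BATCH free tags from the shared map with one atomic update. */
static uint64_t grab_batch(struct tag_map *map)
{
    uint64_t old = atomic_load(&map->busy);

    for (;;) {
        uint64_t avail = ~old;
        uint64_t batch = 0;

        for (int i = 0; i < BATCH && avail; i++) {
            uint64_t bit = avail & -avail;   /* lowest free bit */

            batch |= bit;
            avail &= ~bit;
        }
        if (!batch)
            return 0;                        /* shared map is exhausted */
        /* On failure the CAS reloads 'old', so the loop simply retries. */
        if (atomic_compare_exchange_weak(&map->busy, &old, old | batch))
            return batch;
    }
}

/* Get a tag: use the local cache if possible, else refill with a batch. */
static int get_tag(struct tag_map *map, struct ctx_cache *c)
{
    int tag;

    if (c->free_mask) {
        c->hits++;
    } else {
        c->misses++;
        c->free_mask = grab_batch(map);
        if (!c->free_mask)
            return -1;                       /* no tags available */
    }
    tag = lowest_bit(c->free_mask);
    c->free_mask &= c->free_mask - 1;        /* clear the bit just taken */
    return tag;
}

/* Assumption: a freed tag is returned directly to the shared map. */
static void put_tag(struct tag_map *map, int tag)
{
    assert(tag >= 0 && tag < 64);
    atomic_fetch_and(&map->busy, ~(UINT64_C(1) << tag));
}
```

With BATCH set to 8, the first get_tag() on an empty cache counts as a miss and touches the shared word once; the next seven calls on the same ctx are hits that only touch ctx-local state. That is where the saving over piecemeal allocation would come from.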