On Mon, Feb 03, 2020 at 01:15:58AM -0800, Christoph Hellwig wrote: > On Fri, Jan 31, 2020 at 04:53:41PM -0800, Satya Tangirala wrote: > > So I tried reading through more of blk-mq and the IO schedulers to figure > > out how to do this. As far as I can tell, requests may be merged with > > each other until they're taken off the scheduler. So ideally, we'd > > program a keyslot for a request when it's taken off the scheduler, but > > this happens from within an atomic context. Otoh, programming a keyslot > > might cause the thread to sleep (in the event that all keyslots are in use > > by other in-flight requests). Unless I'm missing something, or you had some > > other different idea in mind, I think it's a lot easier to stick to letting > > blk-crypto program keyslots and manage them per-bio... > > But as far as I understand from reading the code it only sleeps because > it waits for another key slot to be released. Which is exactly like > any other resource constraint in the storage device. In that case > ->queue_rq returns BLK_STS_RESOURCE (or you do the equivalent in the > blk-mq code) and the queue gets woken again once the resource is > available. Wouldn't that mean that all the other requests in the queue, even ones that don't even need any inline encryption, also don't get processed until the queue is woken up again? And if so, are we really ok with that? As you said, we'd need the queue to wake up once a keyslot is available. It's possible that only some hardware queues and not others get blocked because of keyslot programming, so ideally, we could somehow make the correct hardware queue(s) wake up once a keyslot is freed. But the keyslot manager can't assume that it's actually blk-mq that's being used underneath, so if we want to get the keyslot manager to do something once a keyslot was freed, it would need some generic way to signal that to blk-mq. We can also just wait around for the queue to restart by itself after some time delay and try to program the keyslot again at that point, although I wouldn't want to do that because in the current design we know exactly when a keyslot is freed, and we don't need to rely on potentially inefficient guesswork about when we can successfully program a keyslot. Maybe we're alright with waking up all the queues rather than only the ones that really need it? But in any case, I don't know yet what the best way to solve this problem is. We would also need to make changes to handle programming keyslots in some of the other make_request_fns besides blk_mq_make_request too (wherever relevant, at least) which adds more complexity. Overall, it seems to me like trying to manage programming of keyslots on a per-request basis is maybe more code than what we have now, and I'm not sure what we're really buying by doing it (other than perhaps the performance benefit of having to get fewer refcounts on a variable and fewer comparisions of cryptographic keys). Also I forgot to mention this in my previous mail, but there may be some drivers/devices whose keyslots cannot be programmed from an atomic context, so this approach which might make things difficult in those situations (the UFS v2.1 spec, which I followed while implementing support for inline crypto for UFS, does not care whether we're in an atomic context or not, but there might be specifications for other drivers, or even some particular UFS inline encryption hardware that do). So unless you have strong objections, I'd want to continue programming keyslots per-bio for the above reasons.