On 8/10/21 8:48 AM, Jens Axboe wrote:
> On 8/10/21 8:24 AM, Jens Axboe wrote:
>> On 8/10/21 7:53 AM, Jens Axboe wrote:
>>> On 8/10/21 7:15 AM, Ming Lei wrote:
>>>> Hi Jens,
>>>>
>>>> On Mon, Aug 09, 2021 at 03:23:58PM -0600, Jens Axboe wrote:
>>>>> Add a set of helpers that can encapsulate bio allocations, reusing them
>>>>> as needed. Caller must provide the necessary locking, if any is needed.
>>>>> The primary intended use case is polled IO from io_uring, which will not
>>>>> need any external locking.
>>>>>
>>>>> Very simple - keeps a count of bio's in the cache, and maintains a max
>>>>> of 512 with a slack of 64. If we get above max + slack, we drop slack
>>>>> number of bio's.
>>>>>
>>>>> The cache is intended to be per-task, and the user will need to supply
>>>>> the storage for it. As io_uring will be the only user right now, provide
>>>>> a hook that returns the cache there. Stub it out as NULL initially.
>>>>
>>>> Is it possible for user space to submit & poll IO from different io_uring
>>>> tasks?
>>>>
>>>> Then one bio may be allocated from bio cache of the submission task, and
>>>> freed to cache of the poll task?
>>>
>>> Yes that is possible, and yes that would not benefit from this cache
>>> at all. The previous version would work just fine with that, as the
>>> cache is just under the ring lock and hence you can share it between
>>> tasks.
>>>
>>> I wonder if the niftier solution here is to retain the cache in the
>>> ring still, yet have the pointer be per-task. So basically the setup
>>> that this version does, except we store the cache itself in the ring.
>>> I'll give that a whirl, should be a minor change, and it'll work per
>>> ring instead then like before.
>>
>> That won't work, as we'd have to do a ctx lookup (which would defeat the
>> purpose), and we don't even have anything to key off of at that point...
>>
>> The current approach seems like the only viable one, or adding a member
>> to kiocb so we can pass in the cache in question. The latter did work
>> just fine, but I really dislike the fact that it's growing the kiocb to
>> more than a cacheline.
>
> One potential way around this is to store the bio cache pointer in the
> iov_iter. Each consumer will set up a new struct to hold the bio etc, so
> we can continue storing it in there and have it for completion as well.
>
> Upside of that is that we retain the per-ring cache, instead of
> per-task, and iov_iter has room to hold the pointer without getting near
> the cacheline size yet.
>
> The downside is that it's kind of odd to store it in the iov_iter, and
> I'm sure that Al would hate it. Does seem like the best option though,
> in terms of getting the storage for the cache "for free".

Here's that approach:

https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-bio-cache.3

Totally untested so far.

-- 
Jens Axboe
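[Editor's note: for readers following the thread, below is a minimal
userspace sketch of the max/slack trimming policy the patch blurb
describes: keep a count of cached bios, and once the count exceeds
max (512) plus slack (64), drop slack bios in one go. The names
bio_cache, cache_get, and cache_put are hypothetical stand-ins, not the
actual patch API, and the plain malloc()/free() free list is a
simplification of what the kernel does.]

#include <stdlib.h>

#define CACHE_MAX	512
#define CACHE_SLACK	64

struct bio {			/* stand-in for the kernel's struct bio */
	struct bio *next;
};

struct bio_cache {
	struct bio *free_list;	/* singly linked list of cached bios */
	unsigned int nr;	/* current number of cached bios */
};

/* Allocate from the cache if possible, else fall back to the allocator. */
static struct bio *cache_get(struct bio_cache *cache)
{
	struct bio *bio = cache->free_list;

	if (bio) {
		cache->free_list = bio->next;
		cache->nr--;
		return bio;
	}
	return malloc(sizeof(*bio));
}

/* Return a bio to the cache; trim once we get above max + slack. */
static void cache_put(struct bio_cache *cache, struct bio *bio)
{
	bio->next = cache->free_list;
	cache->free_list = bio;
	cache->nr++;

	if (cache->nr > CACHE_MAX + CACHE_SLACK) {
		unsigned int i;

		/* Drop 'slack' bios so the cache settles back near max. */
		for (i = 0; i < CACHE_SLACK; i++) {
			bio = cache->free_list;
			cache->free_list = bio->next;
			free(bio);
		}
		cache->nr -= CACHE_SLACK;
	}
}

int main(void)
{
	struct bio_cache cache = { .free_list = NULL, .nr = 0 };
	struct bio *bio = cache_get(&cache);	/* falls back to malloc() */

	cache_put(&cache, bio);			/* cached for reuse */
	return 0;
}

The point of the slack is hysteresis: trimming only when the count
exceeds max + slack, and then freeing a whole batch, avoids freeing and
reallocating on every put/get cycle right at the 512 boundary.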