> On 6 Sep 2017, at 17.20, Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 09/06/2017 09:13 AM, Jens Axboe wrote:
>> On 09/06/2017 09:12 AM, Javier González wrote:
>>>> On 6 Sep 2017, at 17.09, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>>
>>>> On 09/06/2017 09:08 AM, Johannes Thumshirn wrote:
>>>>> On Wed, Sep 06, 2017 at 05:01:01PM +0200, Javier González wrote:
>>>>>> Check for failed mempool allocations and act accordingly.
>>>>>
>>>>> Are you sure it is needed? Quoting from mempool_alloc()'s documentation:
>>>>> "[...] Note that due to preallocation, this function *never* fails when called
>>>>> from process contexts. (it might fail if called from an IRQ context.) [...]"
>>>>
>>>> It's not needed, mempool_alloc() will never fail if __GFP_WAIT is set in the
>>>> mask. The use case here is GFP_KERNEL, which does include __GFP_WAIT.
>>>
>>> Thanks for the clarification. Do you just drop the patch, or do you want
>>> me to re-send the series?
>>
>> No need to resend. I'll pick up the others in a day or two, once people
>> have had some time to go over them.
>
> I took a quick look at your mempool usage, and I'm not sure it's
> correct. For a mempool to work, you have to ensure that you provide a
> forward progress guarantee. With that guarantee, you know that if you do
> end up sleeping on allocation, you already have items inflight that will
> be freed when that operation completes. In other words, all allocations
> must have a defined and finite life time, as any allocation can
> potentially sleep/block for that life time. You can't allocate something
> and hold on to it forever; then you are violating the terms of the
> agreement that makes a mempool work.

I understood the part about guaranteeing the number of in-flight items to
keep the mempool active without waiting, but I must admit that I assumed
the mempool would resize under pressure and that the penalty would be
increased latency, not the mempool giving up and causing a deadlock.

> The first one that caught my eye is pblk->page_pool. You have this loop:
>
> for (i = 0; i < nr_pages; i++) {
>         page = mempool_alloc(pblk->page_pool, flags);
>         if (!page)
>                 goto err;
>
>         ret = bio_add_pc_page(q, bio, page, PBLK_EXPOSED_PAGE_SIZE, 0);
>         if (ret != PBLK_EXPOSED_PAGE_SIZE) {
>                 pr_err("pblk: could not add page to bio\n");
>                 mempool_free(page, pblk->page_pool);
>                 goto err;
>         }
> }
>
> which looks suspect. This mempool is created with a reserve pool of
> PAGE_POOL_SIZE (16) members. Do we know if the bio has 16 pages or less?
> If not, then this is broken and can deadlock forever.

I can see that in this case 16 elements are not enough. On the read path
we can guarantee that a read will be <= 64 sectors (4KB pages), so this
is definitely wrong. I'll fix it tomorrow.

While we are at it, I have for some time wondered what the right way is
to balance mempool sizes so that we are good citizens and still perform
well. I don't see a lot of code using mempool_resize() to tune min_nr
based on load. For example, in our write path the numbers are easy to
calculate, but on the read path I completely over-dimensioned the
mempool to avoid having to wait on the completion path. Any good rule of
thumb here?

> You have a lot of mempool usage in the code, would probably not hurt to
> audit all of them.

Yes. I will take a look and add comments explaining the sizes.

Thanks Jens,
Javier
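
P.S. To make sure I understand the never-fails point: if mempool_alloc()
cannot return NULL in process context with GFP_KERNEL, the check can
simply go away. A minimal sketch of the loop without it (untested, just
to confirm my reading):

    for (i = 0; i < nr_pages; i++) {
            /*
             * GFP_KERNEL allows direct reclaim, so mempool_alloc()
             * sleeps until an element is returned to the pool rather
             * than failing; no NULL check is needed here.
             */
            page = mempool_alloc(pblk->page_pool, GFP_KERNEL);

            ret = bio_add_pc_page(q, bio, page, PBLK_EXPOSED_PAGE_SIZE, 0);
            if (ret != PBLK_EXPOSED_PAGE_SIZE) {
                    pr_err("pblk: could not add page to bio\n");
                    mempool_free(page, pblk->page_pool);
                    goto err;
            }
    }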
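
For the sizing fix itself, what I have in mind is to make the reserve
cover the worst-case request, so a single request can never need more
elements than the pool guarantees. A sketch, where PBLK_MAX_REQ_PAGES is
a made-up name for the 64-sector maximum read mentioned above:

    /* Made-up constant: max 4KB pages one read request can need. */
    #define PBLK_MAX_REQ_PAGES 64

    /*
     * Reserve enough 0-order pages for one worst-case request. Note
     * that this bounds a single request; if several requests can hold
     * partial allocations while sleeping for more pages, they would
     * still need to be serialized or bounded separately.
     */
    pblk->page_pool = mempool_create_page_pool(PBLK_MAX_REQ_PAGES, 0);
    if (!pblk->page_pool)
            return -ENOMEM;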
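
And on tuning min_nr based on load, the only interface I see is
mempool_resize(). Something like this hypothetical adjustment, where
under_heavy_load() is a predicate I made up for illustration:

    /*
     * Hypothetical tuning from process context: grow the reserve under
     * sustained read pressure, shrink it when idle. This only changes
     * the preallocated minimum; it does not change the forward
     * progress analysis above.
     */
    if (under_heavy_load(pblk))
            mempool_resize(pblk->page_pool, 4 * PBLK_MAX_REQ_PAGES);
    else
            mempool_resize(pblk->page_pool, PBLK_MAX_REQ_PAGES);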