Re: Intermittent storage (dm-crypt?) freeze - regression 6.4->6.5

Mikulas Patocka <mpatocka@xxxxxxxxxx> · Mon, 30 Oct 2023 08:37:39 +0100 (CET)

On Sun, 29 Oct 2023, Vlastimil Babka wrote:

> Haven't found any. However I'd like to point out some things I noticed in
> crypt_alloc_buffer(), although they are probably not related.
> 
> > static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
> > {
> > 	struct crypt_config *cc = io->cc;
> > 	struct bio *clone;
> > 	unsigned int nr_iovecs = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > 	gfp_t gfp_mask = GFP_NOWAIT | __GFP_HIGHMEM;
> > 	unsigned int remaining_size;
> > 	unsigned int order = MAX_ORDER - 1;
> > 
> > retry:
> > 	if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
> > 		mutex_lock(&cc->bio_alloc_lock);
> 
> What if we end up in "goto retry" more than once? I don't see a matching

It is impossible. Before we jump to the retry label, we set 
__GFP_DIRECT_RECLAIM. mempool_alloc can't ever fail if 
__GFP_DIRECT_RECLAIM is present (it will just wait until some other task 
frees some objects into the mempool).

> unlock. Yeah, very unlikely to happen that order-0 in page allocator which
> includes __GFP_DIRECT_RECLAIM would fail, but not impossible, and also I see
> crypt_page_alloc() for the mempool can fail for another reason, due to a
> counter being too high. Looks dangerous.

If crypt_page_alloc fails, mempool falls back to allocating from a 
pre-allocated list.

But now I see that there is a bug that the compound pages don't contribute 
to the "cc->n_allocated_pages" counter. I'll have to fix it.

> > 
> > 	clone = bio_alloc_bioset(cc->dev->bdev, nr_iovecs, io->base_bio->bi_opf,
> > 				 GFP_NOIO, &cc->bs);
> > 	clone->bi_private = io;
> > 	clone->bi_end_io = crypt_endio;
> > 
> > 	remaining_size = size;
> > 
> > 	while (remaining_size) {
> > 		struct page *pages;
> > 		unsigned size_to_add;
> > 		unsigned remaining_order = __fls((remaining_size + PAGE_SIZE - 1) >> PAGE_SHIFT);
> 
> Tip: you can use get_order(remaining_size) here.

get_order rounds the size up and we need to round it down here (rounding 
it up would waste memory).

> > 		order = min(order, remaining_order);
> > 
> > 		while (order > 0) {
> 
> Is this intentionally > 0 and not >= 0? We could still succeed avoiding
> mempool with order-0...

Yes, it is intentional. mempool alloc will try to allocate the page using 
alloc_page, so there is no need to go to the "pages = alloc_pages" branch 
before it.

> > 			pages = alloc_pages(gfp_mask
> > 				| __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN | __GFP_COMP,
> > 				order);
> > 			if (likely(pages != NULL))
> > 				goto have_pages;
> > 			order--;
> > 		}
> > 
> > 		pages = mempool_alloc(&cc->page_pool, gfp_mask);
> > 		if (!pages) {
> > 			crypt_free_buffer_pages(cc, clone);
> > 			bio_put(clone);
> > 			gfp_mask |= __GFP_DIRECT_RECLAIM;
> > 			order = 0;
> > 			goto retry;
> > 		}
> > 
> > have_pages:
> > 		size_to_add = min((unsigned)PAGE_SIZE << order, remaining_size);
> > 		__bio_add_page(clone, pages, size_to_add, 0);
> > 		remaining_size -= size_to_add;
> > 	}
> > 
> > 	/* Allocate space for integrity tags */
> > 	if (dm_crypt_integrity_io_alloc(io, clone)) {
> > 		crypt_free_buffer_pages(cc, clone);
> > 		bio_put(clone);
> > 		clone = NULL;
> > 	}
> > 
> > 	if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
> > 		mutex_unlock(&cc->bio_alloc_lock);
> > 
> > 	return clone;
> > }

Mikulas