On 10/25/23 12:13, Mikulas Patocka wrote:
> So, I forward this to memory management maintainers.

Hi,

> What do you think?
>
> - We have a problem that if dm-crypt allocates pages with order > 3
>   (PAGE_ALLOC_COSTLY_ORDER), the system occasionally freezes waiting in
>   writeback.

Hmm, so I've checked the backtraces provided, and none seems to show the
actual allocating task doing the crypt_alloc_buffer(). Do we know what
it's doing? Is it blocked, or perhaps spinning in some infinite loop?

> dm-crypt allocates the pages with GFP_NOWAIT | __GFP_HIGHMEM |
> __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN | __GFP_COMP, so it
> shouldn't put any pressure on the system. If the allocation fails, it
> falls back to smaller order allocation or to mempool as a last resort.

Right. I noticed it may also fall back to __GFP_DIRECT_RECLAIM, but then
it's only order-0.

> When the freeze happens, there is "349264kB" free memory - so the
> system is not short on memory.

Yeah, it may still be fragmented, although the sysrq show-memory report
in [1] suggests it's not; pages do exist up to MAX_ORDER. Weird.

[1] https://lore.kernel.org/all/ZTiJ3CO8w0jauOzW@mail-itl/

> Should we restrict the dm-crypt allocation size to
> PAGE_ALLOC_COSTLY_ORDER? Or is it a bug somewhere in the memory
> management system that needs to be fixed there?

Haven't found any. However, I'd like to point out some things I noticed
in crypt_alloc_buffer(), although they are probably not related.

> static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
> {
>         struct crypt_config *cc = io->cc;
>         struct bio *clone;
>         unsigned int nr_iovecs = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
>         gfp_t gfp_mask = GFP_NOWAIT | __GFP_HIGHMEM;
>         unsigned int remaining_size;
>         unsigned int order = MAX_ORDER - 1;
>
> retry:
>         if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
>                 mutex_lock(&cc->bio_alloc_lock);

What if we end up in "goto retry" more than once? I don't see a matching
unlock.
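One way to avoid taking the mutex twice could be to drop the lock before
jumping back, so the retry path can take it again cleanly. Completely
untested sketch, just to illustrate the idea, in the mempool fallback
hunk further down:

        pages = mempool_alloc(&cc->page_pool, gfp_mask);
        if (!pages) {
                crypt_free_buffer_pages(cc, clone);
                bio_put(clone);
                /* don't hold the mutex across the retry, or the
                 * mutex_lock() after "retry:" deadlocks on ourselves */
                if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
                        mutex_unlock(&cc->bio_alloc_lock);
                gfp_mask |= __GFP_DIRECT_RECLAIM;
                order = 0;
                goto retry;
        }

On the first failure the mask doesn't have __GFP_DIRECT_RECLAIM yet, so
nothing is unlocked; on any later failure the lock taken at "retry:" is
released before it is taken again, and the unlock at the end of the
function stays balanced.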
Yeah, it's very unlikely that an order-0 allocation in the page
allocator, which includes __GFP_DIRECT_RECLAIM, would fail, but it's not
impossible, and I also see that crypt_page_alloc() for the mempool can
fail for another reason, due to a counter being too high. Looks
dangerous.

>
>         clone = bio_alloc_bioset(cc->dev->bdev, nr_iovecs, io->base_bio->bi_opf,
>                                  GFP_NOIO, &cc->bs);
>         clone->bi_private = io;
>         clone->bi_end_io = crypt_endio;
>
>         remaining_size = size;
>
>         while (remaining_size) {
>                 struct page *pages;
>                 unsigned size_to_add;
>                 unsigned remaining_order = __fls((remaining_size + PAGE_SIZE - 1) >> PAGE_SHIFT);

Tip: you can use get_order(remaining_size) here.

>                 order = min(order, remaining_order);
>
>                 while (order > 0) {

Is this intentionally > 0 and not >= 0? We could still succeed in
avoiding the mempool with an order-0 attempt...

>                         pages = alloc_pages(gfp_mask
>                                 | __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN | __GFP_COMP,
>                                 order);
>                         if (likely(pages != NULL))
>                                 goto have_pages;
>                         order--;
>                 }
>
>                 pages = mempool_alloc(&cc->page_pool, gfp_mask);
>                 if (!pages) {
>                         crypt_free_buffer_pages(cc, clone);
>                         bio_put(clone);
>                         gfp_mask |= __GFP_DIRECT_RECLAIM;
>                         order = 0;
>                         goto retry;
>                 }
>
> have_pages:
>                 size_to_add = min((unsigned)PAGE_SIZE << order, remaining_size);
>                 __bio_add_page(clone, pages, size_to_add, 0);
>                 remaining_size -= size_to_add;
>         }
>
>         /* Allocate space for integrity tags */
>         if (dm_crypt_integrity_io_alloc(io, clone)) {
>                 crypt_free_buffer_pages(cc, clone);
>                 bio_put(clone);
>                 clone = NULL;
>         }
>
>         if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
>                 mutex_unlock(&cc->bio_alloc_lock);
>
>         return clone;
> }
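If trying order-0 through alloc_pages() before the mempool fallback is
what's wanted, the inner loop could be restructured along these lines.
Untested sketch; note that order is unsigned, so a naive "order >= 0"
loop condition would never terminate:

                for (;;) {
                        pages = alloc_pages(gfp_mask | __GFP_NOMEMALLOC |
                                            __GFP_NORETRY | __GFP_NOWARN |
                                            __GFP_COMP, order);
                        if (likely(pages != NULL))
                                goto have_pages;
                        /* stop after the order-0 attempt, then fall
                         * through to the mempool */
                        if (!order)
                                break;
                        order--;
                }

That way order is still 0 when we reach the mempool fallback, same as
with the current "while (order > 0)" loop.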