On Wed, Sep 06, 2023 at 01:38:23PM +0100, Matthew Wilcox wrote:
> > Is this code path a possibility, which can cause above logs?
> >
> > ptr = jbd2_alloc() -> kmem_cache_alloc()
> > <..>
> > new_folio = virt_to_folio(ptr)
> > new_offset = offset_in_folio(new_folio, ptr)
> >
> > And then I am still not sure what the problem really is?
> > Is it because at the time of checkpointing, the path is still not fully
> > converted to folio?
>
> Oh yikes!  I didn't know that the allocation might come from kmalloc!
> Yes, slab might use high-order allocations.  I'll have to look through
> this and figure out what the problem might be.

I think the probable cause is bh_offset().  Before these patches, if
we allocated a buffer at offset 9kB into an order-2 slab, we'd fill in
b_page with the third page of the slab and calculate bh_offset as 1kB.
With these patches, we set b_page to the first page of the slab, and
bh_offset still comes back as 1kB so we read from / write to entirely
the wrong place.

With this redefinition of bh_offset(), we calculate the offset relative
to the base page if it's a tail page, and relative to the folio if it's
a folio.  Works out nicely ;-)

I have three other things I'm trying to debug right now, so this isn't
tested, but if you have time you might want to give it a run.

diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 6cb3e9af78c9..dc8fcdc40e95 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -173,7 +173,10 @@ static __always_inline int buffer_uptodate(const struct buffer_head *bh)
 	return test_bit_acquire(BH_Uptodate, &bh->b_state);
 }
 
-#define bh_offset(bh)		((unsigned long)(bh)->b_data & ~PAGE_MASK)
+static inline unsigned long bh_offset(struct buffer_head *bh)
+{
+	return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
+}
 
 /* If we *know* page->private refers to buffer_heads */
 #define page_buffers(page)					\
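
For anyone who wants to check the 9kB arithmetic without booting a kernel,
here is a rough userspace sketch.  It is not kernel code: the SKETCH_*
constants, the function names and the slab address 0x100000 are all made up
for illustration, and it assumes 4kB base pages and a slab whose start is
aligned to its own 16kB size.  The b_page_size argument stands in for what
page_size(bh->b_page) would return.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration only: 4kB base pages. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Models the old macro: offset of b_data within its 4kB base page. */
static inline unsigned long old_bh_offset(uintptr_t b_data)
{
	return b_data & ~SKETCH_PAGE_MASK;
}

/*
 * Models the proposed helper: offset of b_data within whatever b_page
 * covers.  b_page_size is a stand-in for page_size(bh->b_page).
 */
static inline unsigned long new_bh_offset(uintptr_t b_data,
					  unsigned long b_page_size)
{
	return b_data & (b_page_size - 1);
}

/* Walks through the example: a buffer 9kB into an order-2 (16kB) slab. */
static inline void sketch_check(void)
{
	uintptr_t slab = 0x100000;		/* hypothetical, 16kB-aligned */
	uintptr_t b_data = slab + 9 * 1024;
	uintptr_t third_page = slab + 2 * SKETCH_PAGE_SIZE;

	/* Before the patches: b_page is the third page, offset is 1kB. */
	assert(third_page + old_bh_offset(b_data) == b_data);

	/* With the patches and the old macro: head page + 1kB, wrong place. */
	assert(slab + old_bh_offset(b_data) == slab + 1024);
	assert(slab + old_bh_offset(b_data) != b_data);

	/* Redefined helper, b_page is the 16kB head page: offset is 9kB. */
	assert(slab + new_bh_offset(b_data, 4 * SKETCH_PAGE_SIZE) == b_data);

	/* Redefined helper, b_page is a 4kB tail page: offset is 1kB. */
	assert(third_page + new_bh_offset(b_data, SKETCH_PAGE_SIZE) == b_data);
}
```

Calling sketch_check() from a test program should pass all assertions.  The
mask only works because the slab start is aligned to its size; with an
unaligned start address the 16kB mask would not give the offset from the
head page.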