Re: [RFC PATCH 00/14] Rearrange batched folio freeing

Ryan Roberts <ryan.roberts@xxxxxxx> · Tue, 5 Sep 2023 14:26:54 +0100

On 05/09/2023 14:15, Matthew Wilcox wrote:
> On Mon, Sep 04, 2023 at 02:25:41PM +0100, Ryan Roberts wrote:
>> I've been doing some benchmarking of this series, as promised, but have hit
>> an oops. It doesn't appear to be easily reproducible, and I'm struggling to
>> figure out the root cause, so thought I would share in case you have any
>> ideas?
> 
> I didn't hit that with my testing.  Admittedly I was using xfs rather
> than ext4, but ...

I've only seen it once.

I have a bit of a hybrid setup - my rootfs is xfs (and using large folios), but
the linux tree (which is being built during the benchmark) is on an ext4
partition. Large anon folios are enabled in this config, so there will be plenty
of large folios in the system.

I'm not sure the fact that this fired from the ext4 path is all that relevant -
the page with the dodgy index is already on the PCP list, so it may or may not
be large.

> 
>>  UBSAN: array-index-out-of-bounds in mm/page_alloc.c:668:46
>>  index 10283 is out of range for type 'list_head [6]'
>>  pstate: 004000c9 (nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>  pc : free_pcppages_bulk+0x330/0x7f8
>>  lr : free_pcppages_bulk+0x7a8/0x7f8
>>  sp : ffff8000aeef3680
>>  x29: ffff8000aeef3680 x28: 000000000000282b x27: 00000000000000fc
>>  x26: 000000008015a39a x25: ffff08181ef9e840 x24: ffff0818836caf80
>>  x23: 0000000000000001 x22: 0000000000000000 x21: ffff08181ef9e850
>>  x20: fffffc200368e680 x19: fffffc200368e6c0 x18: 0000000000000000
>>  x17: 3d3d3d3d3d3d3d3d x16: 3d3d3d3d3d3d3d3d x15: 3d3d3d3d3d3d3d3d
>>  x14: 3d3d3d3d3d3d3d3d x13: 3d3d3d3d3d3d3d3d x12: 3d3d3d3d3d3d3d3d
>>  x11: 3d3d3d3d3d3d3d3d x10: 3d3d3d3d3d3d3d3d x9 : fffffc200368e688
>>  x8 : fffffc200368e680 x7 : 205d343737333639 x6 : ffff08181dee0000
>>  x5 : ffff0818836caf80 x4 : 0000000000000000 x3 : 0000000000000001
>>  x2 : ffff0818836f3330 x1 : ffff0818836f3230 x0 : 006808190c066707
>>  Call trace:
>>   free_pcppages_bulk+0x330/0x7f8
>>   free_unref_page_commit+0x15c/0x250
>>   free_unref_folios+0x37c/0x4a8
>>   release_unref_folios+0xac/0xf8
>>   folios_put+0xe0/0x1f0
>>   __folio_batch_release+0x34/0x88
>>   truncate_inode_pages_range+0x160/0x540
>>   truncate_inode_pages_final+0x58/0x90
>>   ext4_evict_inode+0x164/0x900
>>   evict+0xac/0x160
>>   iput+0x170/0x228
>>   do_unlinkat+0x1d0/0x290
>>   __arm64_sys_unlinkat+0x48/0x98
>>
>> UBSAN is complaining about migratetype being out of range here:
>>
>> /* Used for pages not on another list */
>> static inline void add_to_free_list(struct page *page, struct zone *zone,
>> 				    unsigned int order, int migratetype)
>> {
>> 	struct free_area *area = &zone->free_area[order];
>>
>> 	list_add(&page->buddy_list, &area->free_list[migratetype]);
>> 	area->nr_free++;
>> }
>>
>> And I think that is called from __free_one_page(), which is called
>> from free_pcppages_bulk() at the top of the stack trace. migratetype
>> originates from get_pcppage_migratetype(page), which is page->index. But
>> I can't see where this might be getting corrupted, or how yours or my
>> changes could affect this.
> 
> Agreed with your analysis.
> 
> My best guess is that page->index still contains the file index from
> when this page was in the page cache instead of being overwritten with
> the migratetype.  

Yeah, that was my guess too, but I couldn't see how that was possible. So I
started thinking it could be some separate corruption somehow...
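
To make the suspected failure mode concrete, here is a minimal userspace
sketch. It assumes simplified stand-ins (struct sketch_page and the sketch_*
helpers are hypothetical names, not the kernel's definitions) and only models
the idea that the same index field holds a file offset while the page is in
the page cache and a migratetype once it sits on a PCP list. Note that the
reported index 10283 is 0x282b, which matches x28 in the register dump above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Matches the 'list_head [6]' array size from the UBSAN report. */
#define SKETCH_NR_FREE_LISTS 6

/* Hypothetical stand-in for struct page: one field, two meanings. */
struct sketch_page {
	/* File offset while in the page cache; migratetype while on a PCP list. */
	unsigned long index;
};

/* Models stashing the migratetype in ->index when a page is freed to the PCP. */
static inline void sketch_set_migratetype(struct sketch_page *page, int mt)
{
	assert(mt >= 0 && mt < SKETCH_NR_FREE_LISTS);
	page->index = (unsigned long)mt;
}

/* Models reading the migratetype back; it blindly trusts ->index. */
static inline int sketch_get_migratetype(const struct sketch_page *page)
{
	return (int)page->index;
}

/* True if the value read back is a safe index into a 6-entry free_list[]. */
static inline bool sketch_migratetype_valid(const struct sketch_page *page)
{
	int mt = sketch_get_migratetype(page);

	return mt >= 0 && mt < SKETCH_NR_FREE_LISTS;
}
```

If a page reaches the bulk-free path without the set step having run, a stale
file index such as 10283 would fail the validity check in this sketch, which
is the shape of what UBSAN reported. It says nothing about which path skipped
the overwrite.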

> This is ext4, so large folios aren't in use.
> 
> I'll look more later, but I don't immediately see the problem.
>