On Sun, Jan 29, 2023 at 12:18:51PM +0000, Matthew Wilcox (Oracle) wrote:
> Both f2fs and ext4 end up passing the ciphertext page to
> wbc_account_cgroup_owner(). At the moment, the ciphertext page appears
> to belong to no cgroup, so it is accounted to the root_mem_cgroup instead
> of whatever cgroup the original page was in.
>
> It's hard to say how far back this is a bug. The crypto code shared
> between ext4 & f2fs was created in May 2015 with commit 0b81d0779072,
> but neither filesystem did anything with memcg_data before then. memcg
> writeback accounting was added to ext4 in July 2015 in commit 001e4a8775f6
> and it wasn't added to f2fs until January 2018 (commit 578c647879f7).
>
> I'm going with the ext4 commit since this is the first commit where
> there was a difference in behaviour between encrypted and unencrypted
> filesystems.
>
> Fixes: 001e4a8775f6 ("ext4: implement cgroup writeback support")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> ---
>  fs/crypto/crypto.c | 3 +++
>  1 file changed, 3 insertions(+)

What is the actual effect of this bug? The bounce pages are short-lived, so
surely it doesn't really matter what memory cgroup they get charged to?

I guess it's really more about the effect on cgroup writeback? And that's also
the reason why this is a problem here but not e.g. in dm-crypt?

> diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> index e78be66bbf01..a4e76f96f291 100644
> --- a/fs/crypto/crypto.c
> +++ b/fs/crypto/crypto.c
> @@ -205,6 +205,9 @@ struct page *fscrypt_encrypt_pagecache_blocks(struct page *page,
>  	}
>  	SetPagePrivate(ciphertext_page);
>  	set_page_private(ciphertext_page, (unsigned long)page);
> +#ifdef CONFIG_MEMCG
> +	ciphertext_page->memcg_data = page->memcg_data;
> +#endif
>  	return ciphertext_page;
>  }

Nothing outside mm/ and include/linux/memcontrol.h does anything with memcg_data
directly. Are you sure this is the right thing to do here?

Also, this patch causes the following:

[ 16.192276] BUG: Bad page state in process kworker/u4:2 pfn:10798a
[ 16.192919] page:00000000332f5565 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10798a
[ 16.193848] memcg:ffff88810766c000
[ 16.194186] flags: 0x200000000000000(node=0|zone=2)
[ 16.194642] raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
[ 16.195356] raw: 0000000000000000 0000000000000000 00000000ffffffff ffff88810766c000
[ 16.196061] page dumped because: page still charged to cgroup
[ 16.196599] CPU: 0 PID: 33 Comm: kworker/u4:2 Tainted: G T 6.2.0-rc5-00001-gf84eecbf5db1 #3
[ 16.197494] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.1-1-1 04/01/2014
[ 16.198343] Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
[ 16.198899] Call Trace:
[ 16.199143] <TASK>
[ 16.199350] show_stack+0x47/0x56
[ 16.199670] dump_stack_lvl+0x55/0x72
[ 16.200019] dump_stack+0x14/0x18
[ 16.200345] bad_page.cold+0x5e/0x8a
[ 16.200685] free_page_is_bad_report+0x61/0x70
[ 16.201111] free_pcp_prepare+0x13f/0x290
[ 16.201486] free_unref_page+0x27/0x1f0
[ 16.201848] __free_pages+0xa0/0xc0
[ 16.202186] mempool_free_pages+0xd/0x20
[ 16.202556] mempool_free+0x28/0x90
[ 16.202889] fscrypt_free_bounce_page+0x26/0x40
[ 16.203322] ext4_finish_bio+0x1ed/0x240
[ 16.203690] ext4_release_io_end+0x4a/0x100
[ 16.204088] ext4_end_io_rsv_work+0xa8/0x1b0
[ 16.204492] process_one_work+0x27f/0x580
[ 16.204874] worker_thread+0x5a/0x3d0
[ 16.205229] ? process_one_work+0x580/0x580
[ 16.205621] kthread+0x102/0x130
[ 16.205929] ? kthread_exit+0x30/0x30
[ 16.206280] ret_from_fork+0x1f/0x30
[ 16.206620] </TASK>
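
To spell out why the free path complains, here is a rough userspace model of
what the trace suggests is happening. This is only a sketch: struct model_page,
model_setup_bounce() and model_free_check() are hypothetical stand-ins invented
for illustration, not the kernel's definitions. The only thing it assumes is
what the log itself reports, namely that a page being freed with a nonzero
memcg_data is rejected as "page still charged to cgroup".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the fields the sketch needs. */
struct model_page {
	unsigned long memcg_data;   /* nonzero models "charged to a memcg" */
	unsigned long private_data; /* back-pointer to the pagecache page */
};

/*
 * Models the bounce-page setup with the patch applied: the ciphertext
 * page records the pagecache page and borrows its memcg_data.
 */
static void model_setup_bounce(struct model_page *bounce,
			       struct model_page *orig)
{
	assert(bounce != NULL && orig != NULL);
	bounce->private_data = (unsigned long)orig;
	bounce->memcg_data = orig->memcg_data;
}

/*
 * Models the free-time sanity check: returns NULL if the page may be
 * freed, or the reason string seen in the log if it may not.
 */
static const char *model_free_check(const struct model_page *page)
{
	if (page->memcg_data != 0)
		return "page still charged to cgroup";
	return NULL;
}
```

In this model, a bounce page that has gone through model_setup_bounce() with a
charged original page fails model_free_check(), because nothing resets the
copied memcg_data before the page goes back to the mempool. That matches the
fscrypt_free_bounce_page() -> mempool_free() -> __free_pages() path above.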