On Sun, Nov 12, 2023 at 9:10 AM Ryusuke Konishi wrote: > > On Tue, Nov 7, 2023 at 10:49 AM Ryusuke Konishi wrote: > > > > On Tue, Nov 7, 2023 at 2:39 AM Matthew Wilcox (Oracle) wrote: > > > > > > This patch series does most of the page->folio conversions needed in > > > nilfs2. I haven't done the work to support large folios in nilfs2; > > > I don't know if that conversion will be worth the effort. There are > > > still a few page uses left, but the infrastructure isn't quite there to > > > get rid of them yet. > > > > > > Arguably, this is two separate series; the first takes care of the file > > > paths and the second takes care of directories. I've tried my best to > > > include large folio support in the directory code because it'll be needed > > > for large block size devices. It also tries to stay as close as possible > > > to the current ext2 code (so it also includes kmap_local support). > > > > > > These patches are only compile-tested. xfstests doesn't seem to know > > > about nilfs2. > > > > > > Matthew Wilcox (Oracle) (35): > > > nilfs2: Add nilfs_end_folio_io() > > > nilfs2: Convert nilfs_abort_logs to use folios > > > nilfs2: Convert nilfs_segctor_complete_write to use folios > > > nilfs2: Convert nilfs_forget_buffer to use a folio > > > nilfs2: Convert to nilfs_folio_buffers_clean() > > > nilfs2: Convert nilfs_writepage() to use a folio > > > nilfs2: Convert nilfs_mdt_write_page() to use a folio > > > nilfs2: Convert to nilfs_clear_folio_dirty() > > > nilfs2: Convert to __nilfs_clear_folio_dirty() > > > nilfs2: Convert nilfs_segctor_prepare_write to use folios > > > nilfs2: Convert nilfs_page_mkwrite() to use a folio > > > nilfs2: Convert nilfs_mdt_create_block to use a folio > > > nilfs2: Convert nilfs_mdt_submit_block to use a folio > > > nilfs2: Convert nilfs_gccache_submit_read_data to use a folio > > > nilfs2: Convert nilfs_btnode_create_block to use a folio > > > nilfs2: Convert nilfs_btnode_submit_block to use a folio > > > nilfs2: Convert nilfs_btnode_delete to use a folio > > > nilfs2: Convert nilfs_btnode_prepare_change_key to use a folio > > > nilfs2: Convert nilfs_btnode_commit_change_key to use a folio > > > nilfs2: Convert nilfs_btnode_abort_change_key to use a folio > > > nilfs2: Remove page_address() from nilfs_set_link > > > nilfs2: Remove page_address() from nilfs_add_link > > > nilfs2: Remove page_address() from nilfs_delete_entry > > > nilfs2: Return the mapped address from nilfs_get_page() > > > nilfs2: Pass the mapped address to nilfs_check_page() > > > nilfs2: Switch to kmap_local for directory handling > > > nilfs2: Add nilfs_get_folio() > > > nilfs2: Convert nilfs_readdir to use a folio > > > nilfs2: Convert nilfs_find_entry to use a folio > > > nilfs2: Convert nilfs_rename() to use folios > > > nilfs2: Convert nilfs_add_link() to use a folio > > > nilfs2: Convert nilfs_empty_dir() to use a folio > > > nilfs2: Convert nilfs_make_empty() to use a folio > > > nilfs2: Convert nilfs_prepare_chunk() and nilfs_commit_chunk() to > > > folios > > > nilfs2: Convert nilfs_page_bug() to nilfs_folio_bug() > > > > > > fs/nilfs2/btnode.c | 62 +++++------ > > > fs/nilfs2/dir.c | 248 ++++++++++++++++++++------------------------ > > > fs/nilfs2/file.c | 28 ++--- > > > fs/nilfs2/gcinode.c | 4 +- > > > fs/nilfs2/inode.c | 11 +- > > > fs/nilfs2/mdt.c | 23 ++-- > > > fs/nilfs2/namei.c | 33 +++--- > > > fs/nilfs2/nilfs.h | 20 ++-- > > > fs/nilfs2/page.c | 93 +++++++++-------- > > > fs/nilfs2/page.h | 12 +-- > > > fs/nilfs2/segment.c | 157 ++++++++++++++-------------- > > > 11 files changed, 338 insertions(+), 353 deletions(-) > > > > > > -- > > > 2.42.0 > > > > > > > Matthew, thank you so much for this hard work. > > Even if full support for large folios cannot be achieved at this time > > due to limitations in the nilfs2 implementation, I appreciate that you > > are moving forward with the conversion work that should be done. > > > > I haven't reviewed each patch yet, but at least this series can be > > built without problems in my environment too, and so far it is working > > fine including GC and stress tests. > > > > I will review all the patches, but since there are so many, I will not > > add LGTM replies to each one, but will only reply to those that have > > comments (if any). > > > > Many thanks, > > Ryusuke Konishi > > The following WARNING was detected during stress testing on a 32-bit VM: > > [ 270.894814][ T5828] ------------[ cut here ]------------ > [ 270.895409][ T5828] WARNING: CPU: 1 PID: 5828 at mm/highmem.c:611 > kunmap_local_indexed+0xd4/0xfc > <snip> > [ 270.904260][ T5828] EIP: kunmap_local_indexed+0xd4/0xfc > [ 270.904940][ T5828] Code: 00 02 8b 80 5c 0e 00 00 85 c0 78 26 b8 01 > 00 00 00 e8 80 22 df ff 64 a1 84 29 33 c2 85 c0 74 1a e8 75 f4 df ff > 5b 5e 5d c3 90 <0f> 0b eb 95 8d 74 26 00 0f 0b 8d b6 00 00 00 00 e8 13 > 8a 80 00 eb > [ 270.907264][ T5828] EAX: 00000024 EBX: fff99000 ECX: 00068000 EDX: fff97000 > [ 270.908140][ T5828] ESI: 00000003 EDI: f6cc76c0 EBP: e353fda8 ESP: e353fda0 > [ 270.909020][ T5828] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > EFLAGS: 00010206 > [ 270.910043][ T5828] CR0: 80050033 CR2: b145b49c CR3: 23570000 CR4: 00350ed0 > [ 270.910927][ T5828] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 > [ 270.911799][ T5828] DR6: fffe0ff0 DR7: 00000400 > [ 270.912369][ T5828] Call Trace: > [ 270.912771][ T5828] ? show_regs+0x50/0x58 > [ 270.913287][ T5828] ? kunmap_local_indexed+0xd4/0xfc > [ 270.913908][ T5828] ? __warn+0x6f/0x184 > [ 270.914420][ T5828] ? kunmap_local_indexed+0xd4/0xfc > [ 270.915063][ T5828] ? report_bug+0x1b2/0x22c > [ 270.915610][ T5828] ? timers_dead_cpu+0x12b/0x268 > [ 270.916249][ T5828] ? exc_overflow+0x38/0x38 > [ 270.916826][ T5828] ? handle_bug+0x2a/0x48 > [ 270.917353][ T5828] ? exc_invalid_op+0x1b/0x58 > [ 270.917929][ T5828] ? handle_exception+0x130/0x130 > [ 270.918513][ T5828] ? shrink_dentry_list+0x73/0x2bc > [ 270.919121][ T5828] ? exc_overflow+0x38/0x38 > [ 270.919728][ T5828] ? kunmap_local_indexed+0xd4/0xfc > [ 270.920369][ T5828] ? exc_overflow+0x38/0x38 > [ 270.920913][ T5828] ? kunmap_local_indexed+0xd4/0xfc > [ 270.921545][ T5828] nilfs_delete_entry+0xa7/0x1ec [nilfs2] > [ 270.922255][ T5828] nilfs_rename+0x359/0x374 [nilfs2] > [ 270.922899][ T5828] ? find_held_lock+0x24/0x70 > [ 270.923457][ T5828] ? down_write_nested+0x6d/0xd0 > [ 270.924043][ T5828] vfs_rename+0x525/0xaa8 > [ 270.924572][ T5828] ? vfs_rename+0x525/0xaa8 > [ 270.925156][ T5828] ? security_path_rename+0x54/0x7c > [ 270.925794][ T5828] do_renameat2+0x496/0x504 > [ 270.926380][ T5828] __ia32_sys_rename+0x34/0x3c > [ 270.926973][ T5828] __do_fast_syscall_32+0x56/0xc8 > [ 270.927598][ T5828] do_fast_syscall_32+0x29/0x58 > [ 270.928257][ T5828] do_SYSENTER_32+0x15/0x18 > [ 270.928871][ T5828] entry_SYSENTER_32+0x98/0xf1 > [ 270.929582][ T5828] EIP: 0xb146f579 > [ 270.930064][ T5828] Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 > 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 > e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd > 80 90 8d 76 > [ 270.932584][ T5828] EAX: ffffffda EBX: 03ac7650 ECX: 03ac76f0 EDX: b1456ff4 > [ 270.933446][ T5828] ESI: 02aae2c3 EDI: b14a4b80 EBP: bffd8638 ESP: bffd7de8 > [ 270.934311][ T5828] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b > EFLAGS: 00000292 > [ 270.935245][ T5828] Kernel panic - not syncing: kernel: panic_on_warn set ... > > This issue is reproducible and the result of bisect was the following patch: > > > > nilfs2: Switch to kmap_local for directory handling > > It seems that the problem was introduced in the conversion of > "nilfs_rename() -> nilfs_delete_entry()" to kmap_local. > > For the first part of this series (PATH 01/35 - 20/35), my review is > already finished, and I believe nothing breaks existing behavior. > So I'm thinking of sending that part to the -mm tree first (on Monday > or Tuesday), but if you have any opinions, please let me know. > > For the rest, I would like to continue problem analysis and review. > > Thanks, > Ryusuke Konishi The cause of this issue was that in the current implementation of the rename function, there was a hidden part where the use of kmap-kunmap pairs for two pages was not nested. Other than that, there were no problems with the second half of the patchset in the review, so I would like to insert a preparation patch to resolve the order reversal of kunmap (more precisely, nilfs_put_page) calls, so that they can be converted directly to unmap_and_put_page (and then to folio_release_kmap) calls, and send the remaining patches together to the mm tree with only minor adjustments. Regards, Ryusuke Konishi