On Mon, Feb 03, 2025 at 03:07:10PM -0800, Daniel Rosenberg wrote:
> The revert of the unicode patch is in all of the stable branches
> already. That f2fs patch is technically a fix for the revert as well,
> since the existence of either of those is a problem for the same
> reason :/
>
> On Sat, Feb 1, 2025 at 9:06 AM Todd Kjos <tkjos@xxxxxxxxxx> wrote:
> >
> > Before we can bring back the reverted patch, we need the same fix for
> > ext4. Daniel, is there progress on that?

So I have a working fix for ext4 now, but it's going to be a lot more
complicated if we want to bring back the reverted patch. That's
because both e2fsprogs and f2fs-tools need to be able to calculate
the hash used by the directories, and so fsck.ext4 and fsck.f2fs will
get confused if they run across file systems with file names which
were inserted while the reverted patch was in force.

I confirmed this was applicable for both ext4 and f2fs by modifying my
unicode-hijinks script to generate an f2fs image, and then running
fsck.f2fs on the image:

% /sbin/fsck.f2fs /tmp/foo.img
Info: MKFS version
  "Linux version 6.14.0-rc1-xfstests-00013-g30a8509ae0bb-dirty (tytso@cwcc) (gcc (Debian 14.2.0-8) 14.2.0, GNU ld (GNU Binutils for Debian) 2.43.50.20241210) #456 SMP PREEMPT_DYNAMIC Fri Feb 7 01:18:48 EST 2025"
Info: FSCK version
   ...
[FIX] (f2fs_check_hash_code:1471)  --> Mismatch hash_code for "❤️" [9a2ea068:19dd7132]
[FIX] (f2fs_check_hash_code:1471)  --> Mismatch hash_code for "❤️" [9a2ea068:19dd7132]

And of course, this happens with ext4 as well:

Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Problem in HTREE directory inode 4048: block #18 has bad max hash
Invalid HTREE directory inode 4048 (/I/no-E/H/red).  Clear HTree index? yes

Now, I'm not sure how important it is to bring back the reverted
patch. Yes, I know it's claimed that it fixes a "security issue", but
in my opinion, it's a pretty bullshit worry.
First, almost no one uses the case folded feature other than Android,
and second, do you *really* think someone will be trying to run git
under Termux on their Pixel 9 Pro Fold? I mean.... I guess; I do have
Termux installed on my P9PF, but even I'm not crazy enough to try to
install git, emacs, gcc, etc., on an Android phone and expect to get
anything useful done. Using ssh, or mosh, with Termux, sure. But git?
Not convinced....

Anyway, if we *do* want to bring back the reverted patch, it would
need to be reworked so that there is a bit in the encoding flags which
indicates how we are treating Unicode "ignorable" characters, so that
e2fsprogs and f2fs-tools can do the right thing. Once the kernel can
handle things with and without ignorable characters, on a switchable
basis based on a bit in the superblock, then we wouldn't need to use
the linear fallback hack, with the attendant performance penalty.

But honestly, I'm not sure it's worth it. But if someone sends me a
patch which handles the switchable unicode casefold, I'm willing to
spend time to get this integrated into e2fsprogs.

Cheers,

					- Ted

P.S. This has only been tested using a file system image created
using my unicode-hijinks script; it hasn't gone through a full set of
regression tests yet. But it is doing the right thing at least as far
as the Unicode case folding is concerned.
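
For anyone who wants to see the shape of the hash mismatch described
above, here is a rough Python sketch. It is only an illustration under
stated assumptions: the hash is a made-up stand-in (not the ext4 or f2fs
directory hash), the ignorable set holds a single code point, and
FLAG_KEEP_IGNORABLE is a hypothetical name for the kind of
encoding-flags bit discussed above. It shows that two tools which
disagree about dropping ignorable code points fold the same name to
different strings, and so hash it differently.

```python
import hashlib
import unicodedata

# U+FE0F (VARIATION SELECTOR-16) is a Unicode default-ignorable code point.
# Real tables list many more; this one-element set is only for illustration.
IGNORABLE = {"\ufe0f"}

# Hypothetical policy bit, standing in for "a bit in the encoding flags".
FLAG_KEEP_IGNORABLE = 0x1


def fold(name: str, flags: int) -> str:
    """Normalize and casefold a name, dropping ignorables unless the flag is set."""
    folded = unicodedata.normalize("NFD", name).casefold()
    if not flags & FLAG_KEEP_IGNORABLE:
        folded = "".join(ch for ch in folded if ch not in IGNORABLE)
    return folded


def toy_hash(name: str, flags: int) -> int:
    """Toy 32-bit hash of the folded name (NOT the ext4/f2fs directory hash)."""
    digest = hashlib.sha256(fold(name, flags).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


heart = "\u2764\ufe0f"  # HEAVY BLACK HEART followed by VARIATION SELECTOR-16

# Same file name, two policies: the folded strings differ, so the hashes do too.
print(repr(fold(heart, 0)), hex(toy_hash(heart, 0)))
print(repr(fold(heart, FLAG_KEEP_IGNORABLE)),
      hex(toy_hash(heart, FLAG_KEEP_IGNORABLE)))
```

With a policy bit recorded alongside the encoding, a userspace tool
could pick the matching fold() behavior instead of guessing, which is
the point of making the behavior switchable.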