Re: f2fs: Introduce linear search for dentries

On Mon, Feb 03, 2025 at 03:07:10PM -0800, Daniel Rosenberg wrote:
> 
> The revert of the unicode patch is in all of the stable branches
> already. That f2fs patch is technically a fix for the revert as well,
> since the existence of either of those is a problem for the same
> reason :/
> 
> On Sat, Feb 1, 2025 at 9:06 AM Todd Kjos <tkjos@xxxxxxxxxx> wrote:
> >
> > Before we can bring back the reverted patch, we need the same fix for
> > ext4. Daniel, is there progress on that?

So I have a working fix for ext4 now, but it's going to be a lot more
complicated if we want to bring back the reverted patch.  That's
because both e2fsprogs and f2fs-tools need to be able to calculate
the hash used by the directories, and so fsck.ext4 and fsck.f2fs will
get confused if they run across file systems with file names which
were inserted while the reverted patch was in force.
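
To make the failure mode concrete, here is a minimal sketch of why the
checker and the kernel can disagree.  Everything in it is an
assumption for illustration: zlib.crc32 stands in for the real
directory hash, str.casefold() stands in for the kernel's utf8
casefolding, and IGNORABLES is a tiny hand-picked subset of the
Unicode ignorable code points.  The only point is that the stored hash
and the recomputed hash diverge as soon as the two sides disagree
about dropping ignorables.

```python
import zlib

# Illustrative subset only; the real set of ignorable code points is
# defined by the Unicode data tables, not by this list.
IGNORABLES = {"\u00ad", "\u200d", "\ufe0f"}


def fold(name, drop_ignorables):
    """Casefold a name, optionally dropping ignorable code points."""
    folded = name.casefold()
    if drop_ignorables:
        folded = "".join(ch for ch in folded if ch not in IGNORABLES)
    return folded.encode("utf-8")


def toy_hash(data):
    """Stand-in for the on-disk directory hash (NOT the real algorithm)."""
    return zlib.crc32(data)


def check_dirent(name, stored_hash, drop_ignorables):
    """What an fsck-like tool does: recompute the hash and compare."""
    return toy_hash(fold(name, drop_ignorables)) == stored_hash


# U+2764 followed by U+FE0F (variation selector 16), as in the fsck output.
heart = "\u2764\ufe0f"

# One side inserts the name under one rule...
stored = toy_hash(fold(heart, drop_ignorables=False))

# ...and the other side checks it under the other rule.
same_rule_ok = check_dirent(heart, stored, drop_ignorables=False)
other_rule_ok = check_dirent(heart, stored, drop_ignorables=True)
```

With the same rule on both sides the entry verifies; with different
rules the checker sees a hash mismatch for a name that was stored
correctly by the rule in force at the time.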

I confirmed this was applicable for both ext4 and f2fs by modifying my
unicode-hijinks script to generate an f2fs image, and then running
fsck.f2fs on the image:

% /sbin/fsck.f2fs  /tmp/foo.img 
Info: MKFS version
  "Linux version 6.14.0-rc1-xfstests-00013-g30a8509ae0bb-dirty (tytso@cwcc) (gcc (Debian 14.2.0-8) 14.2.0, GNU ld (GNU Binutils for Debian) 2.43.50.20241210) #456 SMP PREEMPT_DYNAMIC Fri Feb  7 01:18:48 EST 2025"
Info: FSCK version
...
[FIX] (f2fs_check_hash_code:1471)  --> Mismatch hash_code for "❤️" [9a2ea068:19dd7132]
[FIX] (f2fs_check_hash_code:1471)  --> Mismatch hash_code for "❤️" [9a2ea068:19dd7132]

And of course, this happens with ext4 as well:

Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Problem in HTREE directory inode 4048: block #18 has bad max hash
Invalid HTREE directory inode 4048 (/I/no-E/H/red).  Clear HTree index? yes


Now, I'm not sure how important it is to bring back the reverted
patch.  Yes, I know it's claimed that it fixes a "security issue", but
in my opinion, it's a pretty bullshit worry.  First, almost no one uses
the case folded feature other than Android, and second, do you
*really* think someone will be trying to run git under Termux
on their Pixel 9 Pro Fold?  I mean.... I guess; I do have Termux
installed on my P9PF, but even I'm not crazy enough to try to install
git, emacs, gcc, etc., on an Android phone and expect to get anything
useful done.  Using ssh, or mosh, with Termux, sure.  But git?  Not
convinced....

Anyway, if we *do* want to bring back the reverted patch, it would need
to be reworked so that there is a bit in the encoding flags which
indicates how we are treating Unicode "ignorable" characters, so that
e2fsprogs and f2fs-tools can do the right thing.  Once the kernel can
handle things with and without ignorable characters, on a switchable
basis based on a bit in the superblock, then we wouldn't need to use
the linear fallback hack, with the attendant performance penalty.
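
As a rough sketch of what "switchable" could look like from the point
of view of the kernel or a userspace tool: the flag name and value
below are hypothetical (nothing like ENC_FLAG_DROP_IGNORABLES exists
today as far as this sketch is concerned), zlib.crc32 is a stand-in
for the real directory hash, and IGNORABLES is only a small
illustrative subset.  The idea is simply that whoever computes a
directory hash consults the per-filesystem encoding flags first.

```python
import zlib

# Hypothetical bit in the superblock's encoding flags; an assumption
# made for illustration, not an existing ext4 or f2fs definition.
ENC_FLAG_DROP_IGNORABLES = 0x0002

# Illustrative subset of ignorable code points, not the full list.
IGNORABLES = {"\u00ad", "\u200d", "\ufe0f"}


def fold_name(name, encoding_flags):
    """Casefold a name according to the file system's encoding flags."""
    folded = name.casefold()
    if encoding_flags & ENC_FLAG_DROP_IGNORABLES:
        folded = "".join(ch for ch in folded if ch not in IGNORABLES)
    return folded.encode("utf-8")


def dir_hash(name, encoding_flags):
    """Toy directory hash; both kernel and fsck would call the same thing."""
    return zlib.crc32(fold_name(name, encoding_flags))
```

Because the choice is recorded in the file system itself, a checker
reading the same flags recomputes the same hash the kernel stored, so
neither side has to guess and no linear fallback is needed.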

But honestly, I'm not sure it's worth it.  But if someone sends me a
patch which handles the switchable unicode casefold, I'm willing to
spend time to get this integrated into e2fsprogs.

Cheers,

						- Ted

P.S.  This has only been tested using a file system image created
using my unicode-hijinks script, and it hasn't gone through a full set
of regression tests yet.  But it is doing the right thing at
least as far as the Unicode case folding is concerned.

