On Fri, Nov 04, 2016 at 05:51:05PM -0400, Theodore Ts'o wrote:
> On Fri, Nov 04, 2016 at 10:14:03AM -0600, Andreas Dilger wrote:
> > > 2. In ext4_lookup(), if case insensitivity is enabled, and the
> > > directory lookup does not succeed, fall back to a linear search of the
> > > directory using a case insensitive compare. (This is slow, but
> > > it's faster compared to doing this in userspace).
> >
> > Does it make sense to flag directories with whether entries are inserted
> > with the case-insensitive hash? That allows the common case of having
> > case insensitivity always enabled or disabled working optimally. Falling
> > back to linear search for every negative lookup would be prohibitive for
> > large directories.
>
> I'm proposing that we not make any on-disk format changes for now.
> It's true that this means that we need to degrade to an O(N) brute
> force search, and that it is undefined if there are two files that are
> the same when case folding is enabled (e.g., if there is both a
> Makefile and makefile in the directory).

FYI, avoiding having to degrade to brute-force searches is why XFS
added a mkfs option for ascii-ci support. It is there to indicate
that the directory name hashes are lower-case, case-insensitive
hashes on disk. This means that all case versions of the filename
hash to the same value and collisions can be resolved without
changing any of the existing search code.

We did this with a simple abstraction:

static struct xfs_nameops xfs_ascii_ci_nameops = {
	.hashname	= xfs_ascii_ci_hashname,
	.compname	= xfs_ascii_ci_compname,
};

Where ->hashname() calculates the hash, and ->compname() compares
the name on disk for a match during lookup. The only other
difference is that the lookup path instantiates the dentry
differently depending on whether it was an exact match or a CI match
(see xfs_vn_ci_lookup()).

As on-disk changes go, this one should be relatively simple as there
is no actual structural change.
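To make the hashname/compname split above concrete, here is a minimal userspace sketch. It is not the XFS code: the toy_* names, the FNV-1a hash and the flat array standing in for a hash-indexed directory are all assumptions made for illustration. It only shows why folding case inside the hash lets the existing "match on hash, then confirm with compname" search find "Makefile" when asked for "makefile".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a per-filesystem name-ops table. */
struct toy_nameops {
	uint32_t (*hashname)(const char *name, size_t len);
	int (*compname)(const char *a, const char *b, size_t len);
};

/* ASCII-only lower-casing, independent of the C locale. */
static unsigned char ascii_fold(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* FNV-1a, used only for brevity; NOT the hash XFS stores on disk. */
static uint32_t toy_hash(const char *name, size_t len, int fold)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)name[i];

		h = (h ^ (fold ? ascii_fold(c) : c)) * 16777619u;
	}
	return h;
}

static uint32_t cs_hashname(const char *n, size_t len) { return toy_hash(n, len, 0); }
static uint32_t ci_hashname(const char *n, size_t len) { return toy_hash(n, len, 1); }

/* Both compare functions return non-zero on a match. */
static int cs_compname(const char *a, const char *b, size_t len)
{
	return memcmp(a, b, len) == 0;
}

static int ci_compname(const char *a, const char *b, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (ascii_fold((unsigned char)a[i]) != ascii_fold((unsigned char)b[i]))
			return 0;
	return 1;
}

static const struct toy_nameops toy_cs_nameops = {
	.hashname	= cs_hashname,
	.compname	= cs_compname,
};

static const struct toy_nameops toy_ascii_ci_nameops = {
	.hashname	= ci_hashname,
	.compname	= ci_compname,
};

/* One directory entry: the hash stored at insert time plus the name. */
struct toy_dirent {
	uint32_t hash;
	const char *name;
};

/*
 * Shared lookup path: filter on the stored hash, then confirm with
 * compname(). The array walk stands in for a hash-indexed search.
 * Returns the index of the matching entry, or -1 if there is none.
 */
static int toy_lookup(const struct toy_nameops *ops,
		      const struct toy_dirent *ents, size_t n,
		      const char *name)
{
	size_t len = strlen(name);
	uint32_t h;

	assert(ops && ops->hashname && ops->compname);
	h = ops->hashname(name, len);
	for (size_t i = 0; i < n; i++) {
		if (ents[i].hash != h)
			continue;
		if (strlen(ents[i].name) == len &&
		    ops->compname(ents[i].name, name, len))
			return (int)i;
	}
	return -1;
}
```

If the entries were inserted with ci_hashname(), toy_lookup() with toy_ascii_ci_nameops finds "Makefile" for the query "makefile" without any extra scan. The same query with toy_cs_nameops against entries hashed by cs_hashname() misses, and that is the case that would need the O(N) fallback.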
:P

> If someone wants to do something "right", which means e2fsprogs and
> kernel changes, getting the Unicode translation code into the kernel
> (and dealing with the bikeshedding that will probably happen when we
> try to get generic Unicode support into the kernel), and that someone

Already happened once with an attempt to get unicode case folding
into XFS. Unfortunately, SGI disappeared before review was completed
and so it never got finalised and merged. However, the code is out
there, so we have pretty much a full implementation of unicode case
folding available.

The v3 RFC (which contains links back to the previous two versions
and discussions) can be found here:

http://oss.sgi.com/archives/xfs/2014-10/msg00067.html

That's the place to start if people want to pick this up - I'd
suggest a generic interface similar to what has been done with the
fs encryption code is the way to proceed with this....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx