Re: [PATCHSET 0/3] xfs: fix ascii-ci problems with userspace


On Tue, Apr 4, 2023 at 11:19 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Limiting yourself to US-ASCII is at least technically valid. Because
> EBCDIC isn't worth worrying about.  But when the high bit is set, you
> had better not touch it, or you need to limit it spectacularly.

Side note: if limiting it to US-ASCII is fine (and it had better be,
because as mentioned, anything else will result in unresolvable
problems), you might look at using this as the pre-hash function:

    unsigned char prehash(unsigned char c)
    {
        /* mask is 0x20 iff bit 6 is set and bit 7 is clear (0x40..0x7f) */
        unsigned char mask = (~(c >> 1) & c & 64) >> 1;
        return c & ~mask;
    }

which does modify a few other characters too, but nothing that matters
for hashing.
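
To make "a few other characters" concrete, here is a small sketch that
walks all 256 byte values and reports the ones prehash() changes that
are not lowercase letters. prehash() is copied from above so the block
stands alone; count_collateral() is a made-up helper name, not anything
from the patchset. For plain ASCII the collateral set is '`', '{', '|',
'}', '~' and DEL, which fold onto '@', '[', '\', ']', '^' and '_'.
Bytes with the high bit set are never touched.

```c
#include <stdio.h>

unsigned char prehash(unsigned char c)
{
    unsigned char mask = (~(c >> 1) & c & 64) >> 1;
    return c & ~mask;
}

/*
 * Hypothetical helper: print and count every byte value that prehash()
 * changes, excluding the intended targets 'a'..'z'.
 */
int count_collateral(void)
{
    int n = 0;

    for (int c = 0; c < 256; c++) {
        unsigned char h = prehash((unsigned char)c);

        if (h != c && !(c >= 'a' && c <= 'z')) {
            printf("0x%02x -> 0x%02x\n", c, h);
            n++;
        }
    }
    return n;
}
```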

The advantage of the above is that you can trivially vectorize it. You
can do it with just regular integer math (64 bits = 8 bytes in
parallel), no need to use *actual* vector hardware.
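
A minimal sketch of that 8-bytes-at-a-time form, assuming the caller
has already loaded 8 name bytes into a 64-bit word. The name prehash8()
is invented for illustration. It works because the only bit examined
after the shift is bit 6 of each byte, which holds that same byte's
original bit 7, so nothing leaks across byte boundaries.

```c
#include <stdint.h>

/* Hypothetical name: applies prehash() to each of the 8 bytes in w. */
static inline uint64_t prehash8(uint64_t w)
{
    /*
     * Per byte: 0x40 survives only if bit 6 is set and bit 7 is clear.
     * Shifting that down by one gives the 0x20 bit to clear.
     */
    uint64_t mask = (~(w >> 1) & w & 0x4040404040404040ull) >> 1;

    return w & ~mask;
}
```

Byte order does not matter here, since every byte is handled
identically.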

The actual comparison needs to do the careful thing (because '~' and
'^' may hash to the same value, but obviously aren't the same), but
even there you can do a cheap "are these 8 characters _possibly_ the
same" check with a very simple single 64-bit comparison, and only go to
the careful path if things match, ie

    /* Cannot possibly be equal even case-insensitively? */
    if ((word1 ^ word2) & ~0x2020202020202020ul)
        continue;
    /* Ok, same in all but bit 5 (0x20) of each byte, go be careful */
    ....

and the reason I mention this is because I have been idly thinking
about supporting case-insensitivity at the VFS layer for multiple
decades, but have always decided that it's *so* nasty that I really
was hoping it just is never an issue in practice.

Particularly since the low-level filesystems then inevitably decide
that they need to do things wrong and need a locale, and at that point
all hope is lost.

I was hoping xfs would be one of the sane filesystems.
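
For reference, the quick-reject comparison described earlier can be
sketched end to end roughly like this. Everything here is illustrative:
ascii_ci_equal() and fold() are made-up names, the careful path is
assumed to fold only 'A'..'Z', and memcpy() is used for the loads so
that alignment is not an issue. It is not the xfs implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Careful fold: only ASCII 'A'..'Z' change; high-bit bytes are untouched. */
static unsigned char fold(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c | 0x20) : c;
}

/* Hypothetical helper: ASCII-only case-insensitive equality, 1 if equal. */
int ascii_ci_equal(const unsigned char *a, const unsigned char *b, size_t len)
{
    size_t i = 0;

    for (; i + 8 <= len; i += 8) {
        uint64_t w1, w2;

        memcpy(&w1, a + i, 8);
        memcpy(&w2, b + i, 8);
        if (w1 == w2)
            continue;               /* byte-for-byte identical */
        /* Cannot possibly be equal even case-insensitively? */
        if ((w1 ^ w2) & ~0x2020202020202020ull)
            return 0;
        /* Same in all but bit 5 of each byte: go be careful. */
        for (size_t j = i; j < i + 8; j++)
            if (fold(a[j]) != fold(b[j]))
                return 0;
    }
    for (; i < len; i++)            /* tail shorter than 8 bytes */
        if (fold(a[i]) != fold(b[i]))
            return 0;
    return 1;
}
```

The '~' versus '^' case passes the quick check (they differ only in
bit 5) and is then rejected by the careful path, which is exactly why
both steps are needed.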

               Linus



