> On Oct 19, 2018, at 11:02 PM, Andreas Dilger <adilger@xxxxxxxxx> wrote: > >> On Oct 18, 2018, at 11:26 AM, Andy Lutomirski <luto@xxxxxxxxxx> wrote: >> >>> On Wed, Oct 17, 2018 at 9:36 PM NeilBrown <neilb@xxxxxxxx> wrote: >>> >>>> On Wed, Oct 17 2018, Andy Lutomirski wrote: >>>> >>>>> On Wed, Oct 17, 2018 at 6:48 PM NeilBrown <neilb@xxxxxxxx> wrote: >>>>> >>>>> >>>>> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() >>>>>> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote: >>>>>> >>>>>> Commit-ID: abfb9498ee1327f534df92a7ecaea81a85913bae >>>>>> Gitweb: http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae >>>>>> Author: Dmitry Safonov <dsafonov@xxxxxxxxxxxxx> >>>>>> AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300 >>>>>> Committer: Ingo Molnar <mingo@xxxxxxxxxx> >>>>>> CommitDate: Tue, 19 Apr 2016 10:44:52 +0200 >>>>>> >>>>>> x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall() >>>>>> >>>>> ... >>>>>> @@ -318,7 +318,7 @@ static inline bool is_x32_task(void) >>>>>> >>>>>> static inline bool in_compat_syscall(void) >>>>>> { >>>>>> - return is_ia32_task() || is_x32_task(); >>>>>> + return in_ia32_syscall() || in_x32_syscall(); >>>>>> } >>>>> >>>>> Hi, >>>>> I'm reply to this patch largely to make sure I get the right people >>>>> ..... >>>>> >>>>> This test is always true when CONFIG_X86_32 is set, as that forces >>>>> in_ia32_syscall() to true. >>>>> However we might not be in a syscall at all - we might be running a >>>>> kernel thread which is always in 64 mode. >>>>> Every other implementation of in_compat_syscall() that I found is >>>>> dependant on a thread flag or syscall register flag, and so returns >>>>> "false" in a kernel thread. >>>>> >>>>> Might something like this be appropriate? >>>>> >>>>> diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h >>>>> index 2ff2a30a264f..c265b40a78f2 100644 >>>>> --- a/arch/x86/include/asm/thread_info.h >>>>> +++ b/arch/x86/include/asm/thread_info.h >>>>> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void * const stack, >>>>> #ifndef __ASSEMBLY__ >>>>> >>>>> #ifdef CONFIG_X86_32 >>>>> -#define in_ia32_syscall() true >>>>> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD)) >>>>> #else >>>>> #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \ >>>>> current_thread_info()->status & TS_COMPAT) >>>>> >>>>> This came up in the (no out-of-tree) lustre filesystem where some code >>>>> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel >>>>> threads. >>>>> >>>> >>>> I could get on board with: >>>> >>>> ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true}) >>>> >>>> The point of these accessors is to be used *in a syscall*. >>>> >>>> What on Earth is Lustre doing that makes it have this problem? >>> >>> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and >>> ->rdev are appropriately sized. This isn't very different from the >>> usage in ext4 to ensure the seek offset for directories is suitable. >>> >>> These interfaces can be used both from systemcalls and from kernel >>> threads, such as via nfsd. >>> >>> I don't *know* if nfsd is the particular kthread that causes problems >>> for lustre. All I know is that ->getattr returns 32bit squashed inode >>> numbers in kthread context where 64 bit numbers would be expected. >>> >> >> Well, that looks like Lustre is copying an ext4 bug. >> >> Hi ext4 people- >> >> ext4's is_32bit_api() function is bogus. You can't use >> in_compat_syscall() unless you know you're in a syscall >> >> The buggy code was introduced in: >> >> commit d1f5273e9adb40724a85272f248f210dc4ce919a >> Author: Fan Yong <yong.fan@xxxxxxxxxxxxx> >> Date: Sun Mar 18 22:44:40 2012 -0400 >> >> ext4: return 32/64-bit dir name hash according to usage type >> >> I don't know what the right solution is. Al, is it legit at all for >> fops->llseek to care about the caller's bitness? If what ext4 is >> doing is legit, then ISTM the VFS needs to gain a new API to tell >> ->llseek what to do. But I'm wondering why FMODE_64BITHASH by itself >> isn't sufficient, >> >> I'm quite tempted to add a warning to the x86 arch code to try to >> catch this type of bug. Fortunately, a bit of grepping suggests that >> ext4 is the only filesystem with this problem. > > We need to know whether the readdir cookie returned to userspace > should be a 32-bit cookie or a 64-bit cookie. Trying to return > a 64-bit value will result in -EOVERFLOW for a 32-bit application, > but is preferable (if possible) because it reduces the chance of > hash collisions causing readdir to have problems. > > Let’s rope Al in. Sorry, I thought he was already on cc. The concept seems reasonable, but the implementation is problematic. For example, the behavior of calling vfs_llseek() is basically undefined. Is some VFS change needed to fix this? Maybe a .compat_llseek or some other explicit indication of whether a 64-bit hash is okay?