Re: linux-next: build failure after merge of the bpf-next tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 20, 2025 at 12:17 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, Mar 19, 2025 at 12:44 PM Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> >
> > On Wed, Mar 19, 2025 at 7:56 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Wed, Mar 19, 2025 at 9:06 AM Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> > > >
> > > > On Wed, Mar 19, 2025 at 3:55 PM Alexei Starovoitov
> > > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, Mar 19, 2025 at 7:36 AM Kumar Kartikeya Dwivedi
> > > > > <memxor@xxxxxxxxx> wrote:
> > > > > >
> > > > > > > >
> > > > > > > > I've sent a fix [0], but unfortunately I was unable to reproduce the
> > > > > > > > problem with an LLVM >= 19 build, idk why. I will try with GCC >= 14
> > > > > > > > as the patches require to confirm, but based on the error I am 99%
> > > > > > > > sure it will fix the problem.
> > > > > > >
> > > > > > > Probably because __seg_gs has CC_HAS_NAMED_AS depends on CC_IS_GCC.
> > > > > > > Let me give it a go with GCC.
> > > > > > >
> > > > > >
> > > > > > Can confirm now that this fixes it, I just did a build with GCC 14
> > > > > > where Uros's __percpu checks kick in.
> > > > >
> > > > > Great. Thanks for checking and quick fix.
> > > > >
> > > > > btw clang supports it with __attribute__((address_space(256))),
> > > > > so CC_IS_GCC probably should be relaxed.
> > > >
> > > > https://github.com/llvm/llvm-project/issues/93449
> > > >
> > > > needs to be fixed first. Also, the feature has to be thoroughly tested
> > > > (preferably by someone having a deep knowledge of clang) before it is
> > > > enabled by default.
> > >
> > > clang error makes sense to me.
> >
> > It is not an error, but an internal compiler error. This should never happen.
>
> Not quite. llvm backends don't have a good way to explain the error,
> but this is invalid condition.
> Arguably llvm should do a better job in such cases instead of
> printing stack trace.
>
> >
> > > What does it even mean to do addr space cast from percpu to normal address:
> > >
> > > __typeof__(int __seg_gs) const_pcpu_hot;
> > > void *__attribute____UNIQUE_ID___addressable_const_pcpu_hot612 =
> > >     (void *)(long)&const_pcpu_hot;
> >
> > Please see [1] for an explanation.
> >
> > [1] https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html#x86-Named-Address-Spaces
>
> You didn't answer my question.

Actually, the above link explains and documents the issue:

"... these address spaces are not considered to be subspaces of the
generic (flat) address space. This means that explicit casts are
required to convert pointers between these address spaces and the
generic address space. In practice the application should cast to
uintptr_t and apply the segment base offset that it installed
previously."

IOW, for __seg_gs address space, there exists no (known) offset that
would define it as a subspace of the generic space. It is gs: prefix
that results in segment override that "switches" to __seg_gs address
space. So, to convert the pointer from __seg_gs to (nonsensical!)
generic address space, GCC allows explicit (void *)(uintptr_t) cast
that in effect just strips gs: prefix from the address. You can then
use the pointer as a pointer to a generic space, but you can't use it
to dereference data from __seg_gs address space - this would be
nonsensical, so (__seg_gs void *)(uintptr_t) cast is needed to convert
pointer back to __seg_gs AS.

> As suspected, gcc is producing garbage code.

Nope, this is expected and well documented behavior.

> See:
> https://godbolt.org/z/ozozYY3nv
>
> For
> void *ptr = (void *)(long)&pcpu_hot;
>
> gcc emits
> .quad pcpu_hot
> which is nonsensical, while clang refuses to produce garbage
> and dumps stack.
>
> Sadly, both compilers produce garbage for ret_addr()

No, they are correct. The pointer is explicitly cast to generic
address space and this is what you get.

> and both compilers produce correct code for ret_value().
> At least something.
>
> Uros,
> your percpu code is broken.
> you shouldn't rely on gcc producing garbage.
> Sooner or later gcc will start erroring on it just as clang.

It won't. It is well documented behavior, as documented in [1].
Regarding linux code, you "should not" pass a pointer to generic
address space to dereference percpu data. Currently,
__verify_percpu_ptr() only triggers a warning when sparse checking is
used, but my patchset will now enforce this as a compile-time error
(this was a much sought feature, and it was possible to implement only
recently by using the newly introduced typeof_unqual() operator). Rest
assured, before enabling this feature in linux, plenty of people
unsuccessfully tried to poke a hole in this functionality and long
threads are archived where address space functionality was discussed
to death. ;)

BTW: You can use:

--cut here--
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 474d648bca9a..e6a7525c9db9 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -105,6 +105,10 @@
 # define __my_cpu_type(var)    typeof(var) __percpu_seg_override
 # define __my_cpu_ptr(ptr)    (__my_cpu_type(*(ptr))*)(__force uintptr_t)(ptr)
 # define __my_cpu_var(var)    (*__my_cpu_ptr(&(var)))
+
+# if __has_attribute(address_space) && defined(USE_TYPEOF_UNQUAL)
+#  define __percpu_qual        __attribute__((address_space(3)))
+# endif
 #endif

 #define __percpu_arg(x)        __percpu_prefix "%" #x
--cut here--

to also see clang's address space checks in action on x86 even without
working __seg_gs support (You will need to disable
Wduplicate-decl-specifier warning).

HTH,
Uros.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux