Re: [PATCH] kbuild: treat char as always signed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 21, 2022 at 03:46:01PM -0700, Linus Torvalds wrote:
> On Thu, Oct 20, 2022 at 3:41 AM Gabriel Paubert <paubert@xxxxxxx> wrote:
> >
> > I must miss something, the strcmp man page says:
> >
> > "The comparison is done using unsigned characters."
> 
> You're not missing anything, I just hadn't looked at strcmp() in forever.
> 
> Yeah, strcmp clearly doesn't care about the signedness of 'char', and
> arguably an unsigned char argument makes more sense considering the
> semantics of the funmction.
> 
> > But it's not for this that I wrote this message. Has anybody considered
> > using transparent unions?
> 
> I don't love the transparent union-as-argument syntax, but you're
> right, that would fix the warning.

I'm not in love with the syntax either.

> 
> Except it then doesn't actually *work* very well.
> 
> Try this:
> 
>         #include <sys/types.h>
> 
>         #if USE_UNION
>         typedef union {
>                 const char *a;
>                 const signed char *b;
>                 const unsigned char *c;
>         } conststring_arg __attribute__ ((__transparent_union__));
>         size_t strlen(conststring_arg);
>         #else
>         size_t strlen(const char *);
>         #endif
> 
>         int test(char *a, unsigned char *b)
>         {
>                 return strlen(a)+strlen(b);
>         }
> 
>         int test2(void)
>         {
>                 return strlen("hello");
>         }
> 
> and now compile it both ways with
> 
>         gcc -DUSE_UNION -Wall -O2 -S t.c
>         gcc -Wall -O2 -S t.c
> 

Ok, I´ve just tried it, except that I had something slightly different in
mind, but perhaps should have been clearer in my first post.

I have change your code to the following:


#include <sys/types.h>

#if USE_UNION
typedef union {
	const char *a;
	const signed char *b;
	const unsigned char *c;
} conststring_arg __attribute__ ((__transparent_union__));
static inline size_t strlen(conststring_arg p)
{
	return __builtin_strlen(p.a);
}
#else
size_t strlen(const char *);
#endif

int test(char *a, unsigned char *b)
{
	return strlen(a)+strlen(b);
}

int test2(void)
{
	return strlen("hello");
}

> and notice how yes, the "-DUSE_UNION" one silences the warning about
> using 'unsigned char *' for strlen. So it seems to work fine.
> 
> But then look at the code it generates for 'test2()" in the two cases.

Now test2 looks properly optimized.

This is a bit exploiting a compiler loophole, it calls an external
function which has been defined with the same name!

Depending on how you look at it, it's either disgusting or clever.

I don´t have clang installed, so I don't know whether it would swallow
this code or react with a strong allergy.

	Gabriel
> 
> The transparent union version actually generates a function call to an
> external 'strlen()' function.
> 
> The regular version uses the compiler builtin, and just compiles
> test2() to return the constant value 5.
> 
> So playing games with anonymous union arguments ends up also disabling
> all the compiler optimizations we do want, becaue apparently gcc then
> decides "ok, I'm not going to warn about you declaring this
> differently, but I'm also not going to use the regular one because you
> declared it differently".
> 
> This, btw, is also the reason why we don't use --freestanding in the
> kernel. We do want the basic <string.h> things to just DTRT.
> 
> For the sockaddr_in games, the above isn't an issue. For strlen() and
> friends, it very much is.
> 
>                        Linus


 




[Index of Archives]     [Linux&nblp;USB Development]     [Linux Media]     [Video for Linux]     [Linux Audio Users]     [Yosemite Secrets]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux