Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 14 Feb 2007 21:42:26 +0000 Ralf Baechle <ralf@xxxxxxxxxxxxxx> wrote:

> Time for a little bit of dead horse flogging.
> 
> On Mon, Mar 06, 2006 at 05:05:52PM -0800, Andrew Morton wrote:
> 
> > > --- a/include/asm-generic/unaligned.h
> > > +++ b/include/asm-generic/unaligned.h
> > > @@ -78,7 +78,7 @@ static inline void __ustw(__u16 val, __u
> > >  
> > >  #define __get_unaligned(ptr, size) ({		\
> > >  	const void *__gu_p = ptr;		\
> > > -	__typeof__(*(ptr)) val;			\
> > > +	__u64 val;				\
> > >  	switch (size) {				\
> > >  	case 1:					\
> > >  		val = *(const __u8 *)__gu_p;	\
> > > @@ -95,7 +95,7 @@ static inline void __ustw(__u16 val, __u
> > >  	default:				\
> > >  		bad_unaligned_access_length();	\
> > >  	};					\
> > > -	val;					\
> > > +	(__typeof__(*(ptr)))val;		\
> > >  })
> > >  
> > >  #define __put_unaligned(val, ptr, size)		\
> > 
> > I worry about what impact that change might have on code generation. 
> > Hopefully none, if gcc is good enough.
> > 
> > But I cannot think of a better fix.
> 
> It does inflate the code but back then we agreed to go for Atsushi's patch
> because it was fairly obviously correct.  This patch obviously is less
> obvious but generates fairly decent, works for arbitrary data types and
> cuts down the size of unaligned.h from 122 lines to 44 so it must be good.
> 
> ...
>
> +#define get_unaligned(ptr)						\
> +({									\
> +	const struct {							\
> +		union {							\
> +			const int __un_foo[0];				\
> +			const __typeof__(*(ptr)) __un;			\
> +		} __un __attribute__ ((packed));			\
> +	} * const __gu_p = (void *) (ptr);				\
> +									\
> +	__gu_p->__un.__un;						\
>  })

Can someone please tell us how this magic works?  (And it does appear to
work).

It seems to assuming that the compiler will assume that members of packed
structures can have arbitrary alignment, even if that alignment is obvious.

Which makes sense, but I'd like to see chapter-and-verse from the spec or
from the gcc docs so we can rely upon it working on all architectures and
compilers from now until ever more.

IOW: your changlogging sucks ;)


[Index of Archives]     [Linux MIPS Home]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Linux]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux