RE: arm_neon.h / vext_u64 (uint64x1_t __a, uint64x1_t __b, __const int __c)

Hi Jochen,

> 
> according to the manual, for the lower 8 Byte of the 16 byte arm simd
> register c is range 0..7,
> 
> for the complete 16 byte of arm simd register c is range 0..15;
> 

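Yes, but the intrinsic's index counts lanes of the vector type, not bytes. A uint64x1_t has a single lane, so #0 is the only valid index, and #0 is useless because it just returns the first operand. See your own example below.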
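If a byte-granular extract on 64-bit values is what is wanted, one possible route is to reinterpret the operands as uint8x8_t and use vext_u8, whose index ranges over 0..7. The following is only a sketch: ext_bytes and EXT_BYTES are hypothetical names (not part of arm_neon.h), a little-endian target is assumed, and the #else branch is a scalar stand-in added so the file also builds on non-AArch64 hosts.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* vext_u8 needs a compile-time constant index, hence the macro. */
#define EXT_BYTES(a, b, n)                                   \
    vreinterpret_u64_u8(vext_u8(vreinterpret_u8_u64(a),      \
                                vreinterpret_u8_u64(b), (n)))

/* Hypothetical helper: byte-level extract with a run-time index. */
uint64_t ext_bytes(uint64_t a, uint64_t b, int n)
{
    uint64x1_t va = vcreate_u64(a), vb = vcreate_u64(b), r = va;
    assert(n >= 0 && n < 8);
    switch (n) {
    case 0: r = EXT_BYTES(va, vb, 0); break;
    case 1: r = EXT_BYTES(va, vb, 1); break;
    case 2: r = EXT_BYTES(va, vb, 2); break;
    case 3: r = EXT_BYTES(va, vb, 3); break;
    case 4: r = EXT_BYTES(va, vb, 4); break;
    case 5: r = EXT_BYTES(va, vb, 5); break;
    case 6: r = EXT_BYTES(va, vb, 6); break;
    case 7: r = EXT_BYTES(va, vb, 7); break;
    }
    return vget_lane_u64(r, 0);
}
#else
/* Scalar stand-in (assumption: same little-endian byte numbering). */
uint64_t ext_bytes(uint64_t a, uint64_t b, int n)
{
    assert(n >= 0 && n < 8);
    return n == 0 ? a : (a >> (8 * n)) | (b << (64 - 8 * n));
}
#endif
```

Calling ext_bytes(a, b, c) for c in 0..7 would play the role of the hand-written asm switch in the example, without inline assembly.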

> example:
> 
> #include <stdio.h>
> #include <arm_neon.h>
> 
> uint64x1_t ext(uint64x1_t a, uint64x1_t b, int c) {
>    uint64x1_t result;
>    switch(c) {
>      case 0: asm("ext %0.8B, %1.8B, %2.8B, #0" : "=w" (result) : "w" (a), "w" (b)); break;
>      case 1: asm("ext %0.8B, %1.8B, %2.8B, #1" : "=w" (result) : "w" (a), "w" (b)); break;
>      case 2: asm("ext %0.8B, %1.8B, %2.8B, #2" : "=w" (result) : "w" (a), "w" (b)); break;
>      case 3: asm("ext %0.8B, %1.8B, %2.8B, #3" : "=w" (result) : "w" (a), "w" (b)); break;
>      case 4: asm("ext %0.8B, %1.8B, %2.8B, #4" : "=w" (result) : "w" (a), "w" (b)); break;
>      case 5: asm("ext %0.8B, %1.8B, %2.8B, #5" : "=w" (result) : "w" (a), "w" (b)); break;
>      case 6: asm("ext %0.8B, %1.8B, %2.8B, #6" : "=w" (result) : "w" (a), "w" (b)); break;
>      case 7: asm("ext %0.8B, %1.8B, %2.8B, #7" : "=w" (result) : "w" (a), "w" (b)); break;
>    }
>    return result;
> }
> 
> int main(int argc, char **argv) {
>    uint64x1_t a, b, result;
>    a[0]=0x0011223344556677;
>    b[0]=0x8899aabbccddeeff;
>    for(int c=0; c<8; c++) {
>      result=ext(a, b, c);
>      printf("%d %016lx\n", c, result[0]);
>    }
>    return 0;
> }
> 
> output:
> 
> 0 0011223344556677

For index 0 you get back the same value that was in a[0].
There is no point in the compiler emitting an instruction that only reproduces its input.
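To make that concrete, here is a small scalar model of the 8-byte form of EXT. It is a sketch, not what GCC does internally: ext8_model is a hypothetical name, and a little-endian host is assumed so that byte 0 is the least significant byte. The result is 8 consecutive bytes taken from the 16-byte concatenation of the two operands, starting at byte n, so n = 0 yields the first operand unchanged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of "ext Vd.8B, Vn.8B, Vm.8B, #n". */
uint64_t ext8_model(uint64_t a, uint64_t b, unsigned n)
{
    uint8_t cat[16];
    uint64_t r;

    assert(n < 8);
    memcpy(cat, &a, 8);      /* bytes 0..7 come from the first operand   */
    memcpy(cat + 8, &b, 8);  /* bytes 8..15 come from the second operand */
    memcpy(&r, cat + n, 8);  /* take 8 consecutive bytes starting at n   */
    return r;
}
```

With n = 0 the third memcpy copies exactly the bytes of a, which is why the header can simply return __a.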

Regards,
Tamar

> 1 ff00112233445566
> 2 eeff001122334455
> 3 ddeeff0011223344
> 4 ccddeeff00112233
> 5 bbccddeeff001122
> 6 aabbccddeeff0011
> 7 99aabbccddeeff00
> 
> Kind regards, Jochen
> 
> On 09.09.20 at 11:17, Tamar Christina wrote:
> > Hi Jochen,
> >
> > EXT is a byte-level extract. If you have a 64-bit vector and a 64-bit
> > type like uint64x1_t, then the only possible index for n is 0.
> >
> > While the compiler could have emitted
> >
> > ext     v0.8b, v0.8b, v1.8b, #0
> >
> > this is pointless as this essentially means to return v0.
> >
> > As such the compiler just uses return __a; as there's no point in emitting
> > an instruction.
> >
> > Regards,
> > Tamar
> >
> >> -----Original Message-----
> >> From: Gcc-help <gcc-help-bounces@xxxxxxxxxxx> On Behalf Of Jochen
> >> Barth via Gcc-help
> >> Sent: Wednesday, September 2, 2020 10:10 PM
> >> To: gcc-help@xxxxxxxxxxx
> >> Subject: arm_neon.h / vext_u64 (uint64x1_t __a, uint64x1_t __b,
> >> __const int __c)
> >>
> >> Dear reader,
> >>
> >> the definition of aarch64/arm_neon.h (gcc 10.2) is
> >>
> >> __extension__ extern __inline uint64x1_t __attribute__
> >> ((__always_inline__, __gnu_inline__, __artificial__))
> >> vext_u64 (uint64x1_t __a, uint64x1_t __b, __const int __c) {
> >>     __AARCH64_LANE_CHECK (__a, __c);
> >>     /* The only possible index to the assembler instruction returns
> >> element 0.  */
> >>     return __a;
> >> }
> >>
> >> So this function does essentially »return __a«.
> >>
> >> If the function name »vext_...« has, as the name suggests, something
> >> to do with the »ext« neon simd instruction,
> >>
> >> then I do not understand where the asm-equivalent »ext« neon
> >> intrinsic is, because the »Arm Architecture Reference Manual«,
> >> chapter C7.2.543
> >> states: »<index> Is the lowest numbered byte element to be
> >> extracted...«, ranging from 0..7 for Q=8 and 0..15 for Q=16
> >> (extraction over the whole 128 bit register).
> >>
> >> PS: gcc with vector expressions does not (?) use »ext« for
> >> y=(x<<(c*8)) | (x>>(64-c*8)); // for Q=8
> >>
> >> Kind regards, Jochen



