Re: GCC -msse2 portability question

Loic,

If you're already doing a runtime check of these CPU feature bits
before calling the functions you want to optimize, then you can use
GCC's function-specific optimization feature (the target attribute).
http://gcc.gnu.org/onlinedocs/gcc-4.4.0/gcc/Function-Attributes.html#index-g_t_0040code_007btarget_007d-function-attribute-2259

Basically you add a target attribute to a function, specifying which
SSE version that function should be compiled for:

void my_optimized_function(void* sse_vec, size_t n)
    __attribute__ ((__target__ ("sse4.2")));

It's available from GCC 4.4 onward. That happens to be the GCC version
on RHEL6, Debian Squeeze, and Ubuntu 10.04 LTS. Hopefully that's good
enough, and you can omit the optimization for people on platforms
older than that.
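
For what it's worth, here is a rough sketch of how the runtime check
and the target attribute might fit together. It is only an illustration
under a few assumptions, not what gf-complete does: the function names
(crc32c, crc32c_sse42, crc32c_generic) are made up, CRC-32C is just a
convenient SSE4.2 example, it assumes an x86 build, and it uses
__builtin_cpu_supports, which needs GCC 4.8 or newer. As far as I know,
including <nmmintrin.h> without -msse4.2 on the command line also needs
GCC 4.9 or newer (or a recent Clang).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <nmmintrin.h>  /* _mm_crc32_u8 (SSE4.2) */

/* Portable fallback: bitwise CRC-32C, reflected polynomial 0x82F63B78.
 * Compiled with the default flags, so it runs on any x86 CPU. */
static uint32_t crc32c_generic(uint32_t crc, const unsigned char *buf, size_t n)
{
    size_t i;
    int k;

    for (i = 0; i < n; i++) {
        crc ^= buf[i];
        for (k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Only this function is compiled with SSE4.2 enabled; the rest of the
 * translation unit keeps the conservative command-line flags. */
static __attribute__((__target__("sse4.2")))
uint32_t crc32c_sse42(uint32_t crc, const unsigned char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        crc = _mm_crc32_u8(crc, buf[i]);
    return crc;
}

/* Hypothetical public entry point: pick an implementation at runtime. */
uint32_t crc32c(const unsigned char *buf, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || n == 0);
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.2"))
        crc = crc32c_sse42(crc, buf, n);
    else
        crc = crc32c_generic(crc, buf, n);
    return crc ^ 0xFFFFFFFFu;
}
```

Callers only ever see crc32c(); the SSE4.2 path is never entered on a
CPU that lacks the feature, so the file can be built without -msse4.2.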

Best,
- Milosz


On Tue, Mar 25, 2014 at 7:22 AM, Laurent GUERBY <laurent@xxxxxxxxxx> wrote:
> On Tue, 2014-03-25 at 10:56 +0100, Loic Dachary wrote:
>> Hi Laurent,
>
> Hi Loic,
>
>> It occurs to me that all we're after is to enable SSE functions such as _mm_set_epi32. We're not trying to have the binary optimized in any implicit way, it is all explicit. The problem seems to be that -msse4.2 will do both
>>
>> * activate _mm_set_epi32 etc functions
>> * optimize the binary to use sse4.2 instructions
>>
>> Do you know of a compiler flag that would only
>>
>> * activate _mm_set_epi32 etc functions
>
> This function is part of an Intel-defined standard for accessing processor
> features; this standard will have one or more implementations depending on
> your compiler/libc/OS. IIRC these functions are closely aligned with
> specific processor features; if the feature isn't there, in general it
> makes no sense to use them.
>
> In the particular case of _mm_set_epi32 it seems
> to be a data formatting inline function:
>
> /usr/lib/gcc/x86_64-linux-gnu/4.7.2/include/emmintrin.h
> ...
> typedef long long __m128i __attribute__ ((__vector_size__ (16),
> __may_alias__));
> ...
> extern __inline __m128i __attribute__((__gnu_inline__,
> __always_inline__, __artificial__))
> _mm_set_epi32 (int __q3, int __q2, int __q1, int __q0)
> {
>   return __extension__ (__m128i)(__v4si){ __q0, __q1, __q2, __q3 };
> }
>
> Functions in this include file use GCC builtins:
>
> http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/X86-Built-in-Functions.html#X86-Built-in-Functions
>
> To avoid any issue I wouldn't use these functions at all
> on a non SSE machine.
>
> Sincerely,
>
> Laurent
>
>> and not
>>
>> * optimize the binary to use sse4.2 instructions
>>
>> ? It may be an RTFM question and I apologize for that. Reading http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/i386-and-x86-64-Options.html#i386-and-x86-64-Options it looks like this is more or less what -mtune=corei7-avx would do (because gf-complete uses PCLMUL when available). But it feels weird to specify a specific processor model when what we need is a set of features.
>>
>> Thanks for your help :-)
>>
>> On 25/03/2014 10:43, Laurent GUERBY wrote:
>> > On Mon, 2014-03-24 at 22:27 +0100, Loic Dachary wrote:
>> >>
>> >> On 23/03/2014 23:34, Laurent GUERBY wrote:
>> >>> http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/i386-and-x86-64-Options.html#i386-and-x86-64-Options
>> >>>
>> >>> So unless you want to run your code on a very, very old 32-bit x86
>> >>> processor, "-msse" shouldn't be an issue. "-msse2" is similar.
>> >>
>> >> This is good to know :) Should I be worried about unintended side effects of -msse4.2, -mssse3, -msse4.1 or -mpclmul? These are the flags that gf-complete is using, specifically.
>> >
>> > Hi,
>> >
>> > SSE4.2 is available only on more recent
>> > processors, as documented on the page above.
>> >
>> > If your library is already dynamically checking for processor
>> > features, I would advise being conservative in your
>> > -m flags, i.e. using what Debian would use for maximum
>> > x86 portability.
>> >
>> > Sincerely,
>> >
>> > Laurent
>> >
>>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Milosz Tanski
CTO
10 East 53rd Street, 37th floor
New York, NY 10022

p: 646-253-9055
e: milosz@xxxxxxxxx