Re: GCC -msse2 portability question

On 23/03/2014 23:34, Laurent GUERBY wrote:
> On Sun, 2014-03-23 at 20:50 +0100, Loic Dachary wrote:
>> Hi Laurent,
>>
>> In the context of optimizing erasure code functions implemented by
>> Kevin Greenan (cc'ed) and James Plank at
>> https://bitbucket.org/jimplank/gf-complete/ we ran across a question
>> you may have the answer to: can gcc -msse2 (or -msse* for that matter)
>> have a negative impact on the portability of the compiled binary
>> code?
>>
>> In other words, if code is compiled without -msse* and runs fine on
>> all Intel processors it targets, could adding -msse* to the
>> compilation of the same source code generate a binary that would fail
>> on some processors? This is assuming no SSE-specific functions are
>> used in the source code.
>>
>> In gf-complete, all SSE-specific instructions are carefully guarded
>> so that they never run on a CPU that does not support them. The
>> runtime detection is done by checking CPUID bits (see
>> https://bitbucket.org/jimplank/gf-complete/pull-request/7/probe-intel-sse-features-at-runtime/diff#Lsrc/gf_intel.cT28 )
>>
>> The corresponding thread is at:
>>
>> https://bitbucket.org/jimplank/gf-complete/pull-request/4/defer-the-decision-to-use-a-given-sse/diff#comment-1479296
>>
>> Cheers
>>
> 
> Hi Loic,
> 
> The GCC documentation lists the architectures that support
> sse/sse2:
> 
> http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/i386-and-x86-64-Options.html#i386-and-x86-64-Options
> 
> So unless you want to run your code on a very old 32-bit x86 processor,
> "-msse" shouldn't be an issue. "-msse2" is similar.

This is good to know :) Should I be worried about unintended side effects of -msse4.2, -mssse3, -msse4.1 or -mpclmul? These are the flags that gf-complete is using, specifically.
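
For reference, here is a minimal sketch of the kind of CPUID check gf-complete relies on at runtime. It is not the code from the pull request: the names struct sse_features and probe_sse_features are made up for illustration, and it assumes the <cpuid.h> header shipped with GCC (clang has one too) on x86, reporting "no features" on any other architecture.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/clang helper header providing __get_cpuid() */
#endif

/* Hypothetical container for the features matching the -m flags above. */
struct sse_features {
    int sse2, sse3, ssse3, sse41, sse42, pclmul;
};

/* Return 1 if the given bit of a CPUID result register is set, else 0. */
static inline int bit_set(uint32_t reg, unsigned bit)
{
    return (int)((reg >> bit) & 1u);
}

/* Query CPUID leaf 1 and decode the SSE-related feature bits. */
struct sse_features probe_sse_features(void)
{
    struct sse_features f = {0, 0, 0, 0, 0, 0};
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid() returns 0 when the requested leaf is unsupported. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        f.sse2   = bit_set(edx, 26);  /* EDX bit 26: SSE2      */
        f.sse3   = bit_set(ecx, 0);   /* ECX bit 0:  SSE3      */
        f.pclmul = bit_set(ecx, 1);   /* ECX bit 1:  PCLMULQDQ */
        f.ssse3  = bit_set(ecx, 9);   /* ECX bit 9:  SSSE3     */
        f.sse41  = bit_set(ecx, 19);  /* ECX bit 19: SSE4.1    */
        f.sse42  = bit_set(ecx, 20);  /* ECX bit 20: SSE4.2    */
    }
#endif
    return f;
}
```

A caller would run probe_sse_features() once at startup and only take the SSE code path when the matching field is 1. Note that this file itself needs no -msse* flag, since it contains no SSE instructions.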

Cheers

> 
> -mtune=xxx, with xxx being a recent arch, could be interesting for you
> because it keeps compatibility with the generic arch while tuning the
> resulting code for the specific arch (for example a currently
> fashionable arch like corei7).
> 
> For a library you can choose, at load/run time, the code executed
> for a specific function by using the STT_GNU_IFUNC feature:
> 
> http://vger.kernel.org/~davem/cgi-bin/blog.cgi/2010/02/07
> http://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html#index-g_t_0040code_007bifunc_007d-attribute-2529
> 
> I believe recent GLIBC uses this feature to tune
> some performance/arch sensitive functions.
> 
> Sincerely,
> 
> Laurent
> 
> 
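
The STT_GNU_IFUNC approach Laurent describes could look roughly like the sketch below. It assumes GCC 4.8 or later (for __builtin_cpu_supports) on x86 Linux with glibc, since ifunc needs support from the dynamic loader. The names sum_bytes, sum_generic, sum_sse42 and resolve_sum are hypothetical and not part of gf-complete. The target("sse4.2") attribute stands in for compiling one file with -msse4.2: it lets the compiler use SSE4.2 instructions in that one function only, even though the source is plain C.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline variant: compiled for the default architecture only. */
static uint32_t sum_generic(const uint8_t *p, size_t n)
{
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += p[i];
    return s;
}

/* Same C source, but the compiler may emit instructions up to SSE4.2
 * here, so this must only be called on CPUs that support them. */
__attribute__((target("sse4.2")))
static uint32_t sum_sse42(const uint8_t *p, size_t n)
{
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += p[i];
    return s;
}

/* Resolver: run by the dynamic loader when sum_bytes is bound. It
 * returns the address of the implementation to use from then on. */
static uint32_t (*resolve_sum(void))(const uint8_t *, size_t)
{
    /* Required because resolvers can run before any constructors. */
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.2") ? sum_sse42 : sum_generic;
}

/* Public symbol; the ifunc attribute ties it to the resolver above. */
uint32_t sum_bytes(const uint8_t *p, size_t n)
    __attribute__((ifunc("resolve_sum")));
```

Callers simply call sum_bytes(); the choice is made once by the loader, so there is no per-call feature test. Both variants compute the same result, only the instructions used may differ.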

-- 
Loïc Dachary, Artisan Logiciel Libre
