It gets a bit more tricky with x86_64, since the architecture dictates that the baseline has SSE2 (but not necessarily anything later). What I would do is support both: SSE2 (maybe in core, without dlopen), and then all the others in an SSE4 version (including SSE4_PCMUL). I'm glossing over x86-32 here, but you could do something similar.

Best
- Milosz

On Tue, Mar 25, 2014 at 3:21 PM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
>
>
> On 25/03/2014 20:13, Kevin Greenan wrote:
>> +1
>>
>> Yeah, that sounds better... Let's keep this as simple as possible.
>
> I'll rework the https://bitbucket.org/jimplank/gf-complete/pull-request/4/defer-the-decision-to-use-a-given-sse accordingly.
>
> Would it be sensible to compile with SSE optimizations only if all are available ( SSE2, SSSE3, SSE4, SSE4_PCMUL ) and not attempt to distinguish between SSSE3 being available but not SSE4_PCMUL etc. ? From what I understand at this point, that kind of distinction is going to be difficult to manage anyway.
>
> Is it too simplistic ?
>
>>
>> -kevin
>>
>>
>> On Tue, Mar 25, 2014 at 12:08 PM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
>>
>> Andreas Peters suggested another approach, which makes sense to me : have one plugin with SSE optimizations enabled, another without them, and choose at runtime between the two.
>>
>> What do you think ?
>>
>> On 23/03/2014 20:50, Loic Dachary wrote:
>> > Hi Laurent,
>> >
>> > In the context of optimizing erasure code functions implemented by Kevin Greenan (cc'ed) and James Plank at https://bitbucket.org/jimplank/gf-complete/ we ran across a question you may have the answer to: can gcc -msse2 (or -msse* for that matter ) have a negative impact on the portability of the compiled binary code ?
>> >
>> > In other words, if code is compiled without -msse* and runs fine on all intel processors it targets, could it be that adding -msse* to the compilation of the same source code generates a binary that would fail on some processors ?
>> > This is assuming no sse specific functions were used in the source code.
>> >
>> > In gf-complete, all sse specific instructions are carefully protected to not be run on a CPU that does not support them. The runtime detection is done by checking CPU id bits ( see https://bitbucket.org/jimplank/gf-complete/pull-request/7/probe-intel-sse-features-at-runtime/diff#Lsrc/gf_intel.cT28 )
>> >
>> > The corresponding thread is at:
>> >
>> > https://bitbucket.org/jimplank/gf-complete/pull-request/4/defer-the-decision-to-use-a-given-sse/diff#comment-1479296
>> >
>> > Cheers
>> >
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>>
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>

--
Milosz Tanski
CTO
10 East 53rd Street, 37th floor
New York, NY 10022

p: 646-253-9055
e: milosz@xxxxxxxxx
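
For reference, the tiering discussed in this thread (SSE2 as the baseline, everything later grouped into a single all-or-nothing SSE4 variant, chosen at runtime) could be sketched roughly as below. This is only an illustration, not code from gf-complete or Ceph: `pick_variant`, `detect_variant`, `variant_name` and the `VARIANT_*` names are made up for the example, and the detection part assumes GCC or Clang on x86, where `__builtin_cpu_init` and `__builtin_cpu_supports` are available. On any other compiler or architecture the sketch simply falls back to the generic variant.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
enum simd_variant {
    VARIANT_GENERIC, /* no SSE at all                        */
    VARIANT_SSE2,    /* baseline on x86_64                   */
    VARIANT_SSE4     /* SSSE3 + SSE4.1 + SSE4.2 + PCLMUL set */
};

/* Pure decision function: the SSE4 tier is all-or-nothing, so a CPU
 * with SSSE3 but without PCLMUL falls back to the SSE2 tier. */
enum simd_variant pick_variant(int sse2, int ssse3, int sse41,
                               int sse42, int pclmul)
{
    if (sse2 && ssse3 && sse41 && sse42 && pclmul)
        return VARIANT_SSE4;
    if (sse2)
        return VARIANT_SSE2;
    return VARIANT_GENERIC;
}

/* Probe the running CPU. Assumes the GCC/Clang x86 builtins; anything
 * else is treated as generic. */
enum simd_variant detect_variant(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return pick_variant(__builtin_cpu_supports("sse2") != 0,
                        __builtin_cpu_supports("ssse3") != 0,
                        __builtin_cpu_supports("sse4.1") != 0,
                        __builtin_cpu_supports("sse4.2") != 0,
                        __builtin_cpu_supports("pclmul") != 0);
#else
    return VARIANT_GENERIC;
#endif
}

/* Suffix a loader could append to a plugin name before dlopen()ing it. */
const char *variant_name(enum simd_variant v)
{
    switch (v) {
    case VARIANT_SSE4:    return "sse4";
    case VARIANT_SSE2:    return "sse2";
    case VARIANT_GENERIC: return "generic";
    }
    assert(0 && "unknown variant");
    return "generic";
}
```

A loader would call `detect_variant()` once at startup and use `variant_name()` to decide which build of the plugin to open. Only the files belonging to each variant would be compiled with the matching `-msse*` flags, so the generic build never contains instructions the CPU may lack.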