Hello,

thank you for considering the patches.

> > * I have no runtime comparison for the orc svolume code yet (note that
> > orc is not used on ARM yet, although it should be possible)

> The ARM version of the svolume code makes use of 'smulwb' instruction,
> making it faster than the Orc code since that's a decomposition of this
> instruction.

I tried to get the Orc svolume to work; this is not possible for the
moment since Orc lacks support for loadpq on ARM NEON, so no comparison.
I have been in contact with David Schleef, but there has been no progress
there. I failed to implement the missing support myself in a reasonable
time.

> On the svolume side, the implementations are initialised in increasing
> order of optimisation, so if you have the tests enabled for all of them,
> you'll get the runtime numbers of each with the previous implementation
> as reference.

OK, I will try to make use of that.

p.

-- 
Peter Meerwald
+43-664-2444418 (mobile)