Please reply to me or CC me in addition to the list, as I am not subscribed.

I have code essentially like this:

    int32_t var1 = <value>, var2 = <value>;
    int64_t acc = <value>;
    acc += (int64_t)((int16_t)(var1 & 0xffff) * (int16_t)((var2 & 0xffff0000)>>16));

which is essentially the same as this test case:

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/testsuite/gcc.target/arm/smlaltt-1.c;hb=HEAD

but instead of generating a SMLALBT or SMLALTB instruction, it generates an ASRS followed by a SMLALBB. This is especially problematic for me because it ends up using an extra register, which causes extra loads/stores in a tight loop.

Is there something I'm missing to get it to generate SMLALxy instructions other than SMLALBB?

Here is the exact code in question:

https://gist.github.com/ajeddeloh/734a2d7e44df219d38d327b5cf82f504

I've stripped most of the unrelated code out. The relevant code is in the loop.

Thanks for your help!

- Andrew
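
P.S. In case it helps with reproducing this, here is a minimal self-contained sketch of the pattern. The function names mac_bt and mac_bt_shift are made up for illustration and do not appear in the gist. The second function is just an alternative spelling of the same computation that might be worth comparing; I have not verified that it changes which instruction gets selected.

```c
#include <stdint.h>

/* Hypothetical reduction: signed bottom half of var1 times signed top
   half of var2, accumulated into a 64-bit value. This is the shape I
   would expect to map onto a SMLALxy form other than SMLALBB. */
int64_t mac_bt(int64_t acc, int32_t var1, int32_t var2)
{
    int16_t lo = (int16_t)(var1 & 0xffff);
    int16_t hi = (int16_t)((var2 & 0xffff0000) >> 16);
    acc += (int64_t)(lo * hi);
    return acc;
}

/* Same value computed with a shift of the signed operand instead of
   mask-then-shift. Relies on GCC's documented behaviour that >> on a
   signed int is an arithmetic shift. */
int64_t mac_bt_shift(int64_t acc, int32_t var1, int32_t var2)
{
    int16_t lo = (int16_t)var1;
    int16_t hi = (int16_t)(var2 >> 16);
    acc += (int64_t)(lo * hi);
    return acc;
}
```

Compiling this with -O2 -S for an ARM target that has the DSP multiply instructions should be enough to compare the generated code for the two functions.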