On Thu, Jun 13, 2013 at 6:55 PM, Chung-Ju Wu <jasonwucj@xxxxxxxxx> wrote:
> 2013/6/13 Chung-Ju Wu <jasonwucj@xxxxxxxxx>:
>> 2013/6/13 Terry Guo <flameroc@xxxxxxxxx>:
>>> Hi there,
>>>
>>> For the below simple case, I think poly1[i] should be hoisted to
>>> outermost loop to avoid loading from innermost loop at each iteration.
>>> But for arm-none-eabi target like cortex-m4, gcc fails to do so. Is
>>> this a normal case or a missing optimization? Please advise.
>>>
>>> void
>>> PolyMul (float *poly1, unsigned int n1,
>>>          float *poly2, unsigned int n2,
>>>          float *polymul, unsigned int *nmul)
>>> {
>>>   unsigned int i, j;
>>>
>>>   for (i = 0; i <= n1; i++)
>>>     for (j = 0; j <= n2; j++)
>>>       polymul[i+j] += poly1[i] * poly2[j];
>>> }
>>>
>>> Thanks.
>>>
>>> BR,
>>> Terry
>>
>> Unless you use 'restrict' qualifier to the pointer type,
>> I don't think compiler can do such optimization due to aliasing issue.
>>
>
> Also I don't think -fstrict-aliasing could help for your case.
> Because poly1, poly2, and polymul are all 'float *' pointer type.
>

Thanks for your help. You are correct. The 'restrict' qualifier gives me
what I want. With the following change, poly1[i] is hoisted into the
outermost loop:

void
PolyMul (float *__restrict__ poly1, unsigned int n1,
         float *__restrict__ poly2, unsigned int n2,
         float *__restrict__ polymul, unsigned int *nmul)

-fstrict-aliasing doesn't help for either the 'int *' or the 'float *'
pointer type.

BR,
Terry
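
For anyone reading this later, here is a minimal sketch of why the hoist is not safe without 'restrict'. It is not taken from GCC or from the original test case: the function names (poly_mul_plain, poly_mul_hoisted, and the two check helpers) are hypothetical, and the unused nmul parameter is dropped for brevity. The first function keeps the original loop shape; the second hoists poly1[i] by hand, which is roughly the transformation that 'restrict' licenses. When polymul overlaps poly1, the two produce different results, so the compiler has to keep the load in the inner loop unless it is told the pointers do not alias.

```c
#include <assert.h>

/* Original loop shape: poly1[i] is re-read on every inner iteration,
   because a store to polymul[i + j] might have changed it. */
void poly_mul_plain(const float *poly1, unsigned int n1,
                    const float *poly2, unsigned int n2,
                    float *polymul)
{
    unsigned int i, j;

    for (i = 0; i <= n1; i++)
        for (j = 0; j <= n2; j++)
            polymul[i + j] += poly1[i] * poly2[j];
}

/* Hand-hoisted variant: poly1[i] is read once per outer iteration.
   Equivalent to the plain version only if polymul does not overlap poly1. */
void poly_mul_hoisted(const float *poly1, unsigned int n1,
                      const float *poly2, unsigned int n2,
                      float *polymul)
{
    unsigned int i, j;

    for (i = 0; i <= n1; i++) {
        const float a = poly1[i];   /* the hoisted load */
        for (j = 0; j <= n2; j++)
            polymul[i + j] += a * poly2[j];
    }
}

/* Returns 1 if both versions agree when the output is a separate array. */
int disjoint_results_match(void)
{
    const float p1[2] = {1.0f, 2.0f};
    const float p2[2] = {3.0f, 4.0f};
    float a[3] = {0.0f, 0.0f, 0.0f};
    float b[3] = {0.0f, 0.0f, 0.0f};

    poly_mul_plain(p1, 1, p2, 1, a);
    poly_mul_hoisted(p1, 1, p2, 1, b);
    return a[0] == b[0] && a[1] == b[1] && a[2] == b[2];
}

/* Returns 1 if the versions disagree when the output array is also used
   as poly1 (legal for these non-restrict pointers). */
int overlap_results_differ(void)
{
    const float p2[2] = {1.0f, 1.0f};
    float a[3] = {1.0f, 2.0f, 0.0f};
    float b[3] = {1.0f, 2.0f, 0.0f};

    poly_mul_plain(a, 1, p2, 1, a);
    poly_mul_hoisted(b, 1, p2, 1, b);
    return a[1] != b[1] || a[2] != b[2];
}

/* Small self-check tying the two cases together. */
void self_check(void)
{
    assert(disjoint_results_match());
    assert(overlap_results_differ());
}
```

The inputs are small integers so the float arithmetic is exact and the comparisons are reliable. Note that calling a '__restrict__'-qualified version with overlapping arrays would be undefined behavior, which is why the overlap case above uses only the unqualified functions.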