Hello Sir/Ma'am,

I am Aishwarya Raamkumar, an Embedded Systems Master's student at TU/e. I am doing my master's project in collaboration with IMEC, HTC, Eindhoven.

For my project, I am using a board provided by IMEC that contains a chip designed in-house. In addition to an ARM Cortex-M4, this chip has a matrix processor. Part of my assignment is to detect matrix multiplications and vector-matrix operations in the source code and replace them with a set of register load/store operations, so that the matrix processor performs the operation instead of the ARM Cortex-M4.

The code is written in C. Currently, I am using the GNU arm-none-eabi cross compiler.

If I have code containing a matrix multiplication such as the loop nest below, the compiler should replace it with the specific register writes shown after it. How can I do this with GCC?

The loop nest

    for(m = 0; m < p; m++) {
        for(n = 0; n < s; n++) {
            res[m][n] = 0;
            for(o = 0; o < q; o++) {
                res[m][n] += a[m][o] * b[o][n];
            }
        }
    }

is to be replaced with

    MATRIX->MATRIX.OPERATION = MVC_OPERATION_MUL;
    MATRIX->INSTRUCTION_b.ACCUMULATE = 1;
    MATRIX->INSTRUCTION_b.SHIFT = 0;
    MATRIX->INSTRUCTION_b.WORDWIDTH = 0; // 32bit
    MATRIX->COUNT_1 = c2;
    MATRIX->COUNT_2 = r1;
    MATRIX->COUNT_3 = c1;
    MATRIX->A_SIZE_INC_1_b.A_SIZE_1 = 1;
    MATRIX->A_SIZE_INC_2_b.A_SIZE_2 = r1;
    MATRIX->A_SIZE_INC_3_b.A_SIZE_3 = c1;
    MATRIX->A_SIZE_INC_1_b.A_INC_1 = 0;
    MATRIX->A_SIZE_INC_2_b.A_INC_2 = 4*c1; // next 'row' is the same data shifted by 1 sample position
    MATRIX->A_SIZE_INC_3_b.A_INC_3 = 4;
    . . .
    MATRIX->R_SIZE_INC_2_b.R_INC_2 = 4;
    MATRIX->R_SIZE_INC_3_b.R_INC_3 = 0;
    MATRIX->B_PTR = b;
    MATRIX->A_PTR = a;
    MATRIX->R_PTR = res;

Could you please tell me whether this is possible and, if so, how I should proceed?

Kind Regards,
Aishwarya