On Thu, Jan 16, 2020 at 03:29:57PM +0000, Raamkumar, A. wrote:
> Hello Sir/Ma'am,
>
> I am Aishwarya Raamkumar, an Embedded Systems Masters student in TU/e. I am doing my masters project in collaboration with IMEC, HTC, Eindhoven. For my project, I am using a board provided to me by IMEC, which contains a chip manufactured designed in-house. This chip has a matrix processor apart from an ARM cortex m4. A part of my assignment is to detect matrix multiplication or vector-matrix operations and replace it with a set of register load-store values operations. This is to make the matrix processor do the matrix operation in place of the arm cortex m4. The code is in C language. Currently, I am using a gnu-none-eabi cross compiler for arm. I wish to know if you could help me by providing insights to do the following. If I have a code with matrix multiplication as shown below, it should replace the matrix multiplication with some specific load operations as shown below. How can I do it with GCC? Could you kindly provide your insights?
>
>
> for(m = 0; m < p; m++)
> {
>     for(n = 0; n < s; n++)
>     {
>         res[m][n] = 0;
>         for(o = 0; o < q; o++)
>         {
>
>             res[m][n] += a[m][o] * b[o][n];
>         }
>     }
> }
>
> is to be replaced with
>
> MATRIX->MATRIX.OPERATION = MVC_OPERATION_MUL;
> MATRIX->INSTRUCTION_b.ACCUMULATE = 1;
> MATRIX->INSTRUCTION_b.SHIFT = 0;
> MATRIX->INSTRUCTION_b.WORDWIDTH = 0; // 32bit
>
> MATRIX->COUNT_1 = c2;
> MATRIX->COUNT_2 = r1;
> MATRIX->COUNT_3 = c1;
>
> MATRIX->A_SIZE_INC_1_b.A_SIZE_1 = 1;
> MATRIX->A_SIZE_INC_2_b.A_SIZE_2 = r1;
> MATRIX->A_SIZE_INC_3_b.A_SIZE_3 = c1;
> MATRIX->A_SIZE_INC_1_b.A_INC_1 = 0;
> MATRIX->A_SIZE_INC_2_b.A_INC_2 = 4*c1; // next 'row' is the same data shifted by 1 sample position
> MATRIX->A_SIZE_INC_3_b.A_INC_3 = 4;
>
> .
> .
> .
>
>
> MATRIX->R_SIZE_INC_2_b.R_INC_2 = 4;
> MATRIX->R_SIZE_INC_3_b.R_INC_3 = 0;
> MATRIX->B_PTR = b;
> MATRIX->A_PTR = a;
> MATRIX->R_PTR = res;
>
> Could you please give me an insight if this is possible?
> And if so, how do I proceed?

Hi Aishwarya

A quick internet search for ARM and BLAS (Basic Linear Algebra Subprograms) gives
https://developer.arm.com/tools-and-software/server-and-hpc/compile/arm-compiler-for-linux/arm-performance-libraries
as the first result. I have not used this library, and I do not know whether it is compatible with gcc (though I would guess that it could well be). The specific function you want would probably be DGEMM (double-precision general matrix multiply).

> Aishwarya

Bob

-- 
"We are in the beginning of a mass extinction, and all you can talk about is money and fairy tales of eternal economic growth." - Greta Thunberg
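
To make the library suggestion a little more concrete, here is a minimal sketch in C of the step that would come first either way: moving the triple loop behind a single function, so that there is one place where a library routine, or the register writes for the matrix processor, could later be substituted. The function name `matmul_i32`, its signature, and the flat row-major layout are assumptions made for illustration and do not come from any library. The 32-bit integer element type is inferred from the `// 32bit` comment in the quoted code. DGEMM works on double-precision values, so it would only fit if the data were converted to `double`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (name and signature are illustrative only).
 * Computes res = a * b for row-major matrices stored in flat arrays:
 *   a   is p x q
 *   b   is q x s
 *   res is p x s
 * The body is the same triple loop as in the quoted code; it is the
 * single place where another implementation could be swapped in.
 */
void matmul_i32(size_t p, size_t q, size_t s,
                const int32_t *a, const int32_t *b, int32_t *res)
{
    assert(a != NULL && b != NULL && res != NULL);

    for (size_t m = 0; m < p; m++) {
        for (size_t n = 0; n < s; n++) {
            int32_t acc = 0;  /* accumulates the dot product of row m and column n */
            for (size_t o = 0; o < q; o++) {
                acc += a[m * q + o] * b[o * s + n];
            }
            res[m * s + n] = acc;
        }
    }
}
```

A caller would pass the dimensions and pointers to the first elements, for example `matmul_i32(p, q, s, &a[0][0], &b[0][0], &res[0][0]);` for contiguous two-dimensional arrays. Whether the body can then be replaced by a library call or by the `MATRIX->...` register sequence depends on details of the hardware that are not given in this thread.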