How to use the KNC vector registers with GCC? Race condition with ICC & KNC?

Hi,

I am new to the GCC mailing list, so I hope this is the right place.

As the subject says, I work with KNC. I have developed a kernel module for a NIC and now want to use the 512-bit registers of KNC for some memory-copy jobs.

I have experience using GCC to compile the KNC Linux and kernel modules, so that part is no problem at the moment. Everything works fine.

Before starting to write inline assembler with the 512-bit registers, I wrote some minimal examples.

On a normal i5-3470 everything works fine with GCC, and on KNC everything works as well. The problem is that when I try to use the 512-bit registers, it looks like GCC doesn't know the register names and instructions.

I think the instructions are not a real problem, because I have the instruction manual, but I have no idea how to solve the register problem.

So I tried ICC with -mmic. The source compiles, but when I measure the clock cycles with rdtsc, the first two checks work and the third and fourth do not. I tried to track the problem down with gdb, but when I compile with -g the error no longer occurs. When I add a printf, sleep(1) or usleep(1), the problem also disappears. So I think there is a race condition with the write of the value to memory, because 1 or even 100 nops have no effect.

My inline assembler knowledge is rudimentary, so I don't know whether I have made mistakes with clobber registers and so on, or whether there is a bug in GCC or ICC.
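
On the clobber question, one fact is worth writing down: rdtsc writes EDX as well as EAX, so a wrapper that lists only EAX as clobbered leaves the compiler free to keep a live value in EDX across the instruction. Whether that explains the behaviour described here is not confirmed. The following is only a minimal sketch of the usual GCC-style pattern on x86-64, where both registers are declared as outputs. The name read_tsc is hypothetical and does not appear in the code of this post.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post).
 * RDTSC puts the low 32 bits of the counter in EAX and the high 32 bits
 * in EDX. Declaring both as outputs ("=a", "=d") tells the compiler that
 * both registers are overwritten, so no clobber list is needed. */
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}
```

Returning the full 64-bit value also avoids the wrap-around that a 32-bit int difference can show on longer measurements.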

Because -g makes the problem disappear with ICC, I cannot debug it. So I hope somebody is able to help me.

My preference is to use GCC together with the 512-bit registers. If there is a bug in my inline assembler, a solution or hint would also be welcome.

Here is my code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>     /* memcpy */
#include <inttypes.h>


int rdtsc_count(void){
int count;
__asm__ __volatile__(   "rdtsc;                 \n\t"
                        "movl   %%eax, %0;      \n\t"
                         :"=m"(count)//, "=r"(brd), "=r"(crd), "=r"(drd)
                         :
                         :"%eax", "memory"//, "cc"//, "%ebx", "%ecx"
                        );

return count;
}


int main(int argc, char *argv[]){

int starta=0, startb=0, stopa=0, stopb=0;
int buffer_size=32;
int i, waddr;
uint64_t *buffer, *packet_buffer, *packet_buffer_ref;
uint32_t buflen=atoi(argv[1]);


/////////////setup
buffer = (uint64_t*) malloc (buffer_size*sizeof(uint64_t));
packet_buffer = (uint64_t*) malloc (buffer_size*sizeof(uint64_t));
packet_buffer_ref= (uint64_t*) malloc (buffer_size*sizeof(uint64_t));//REF

waddr=0;

//printf("address of packet_buffer %x", waddr);
printf("Original data\n");
for(i=0; i<buffer_size; i++){
        buffer[i]=i+i*i;
        packet_buffer[i]=0;
        packet_buffer_ref[i]=0;
        printf("%" PRIx64 "\t", buffer[i]);
}
printf("\n");

printf("packet_buffer start\n");
for(i=0; i<buffer_size; i++){
        printf("%" PRIx64 "\t", packet_buffer[i]);
}
printf("\n");

////////////end_setup

// the copy target starts at index waddr+1, so at most buffer_size-1 words fit
if(buflen==0 || buflen>=(uint32_t)buffer_size){
        printf("buflen too big or too small\n");
        return 0;
}


// ########################################
starta=rdtsc_count();
memcpy(&(packet_buffer_ref[waddr+1]), buffer, sizeof(uint64_t)*(buflen));//REF
stopa=rdtsc_count();
printf("memcpy took\t%d\tclocks\n", stopa-starta);
// ########################################
// Here everything is fine
// ########################################

// ########################################
startb=rdtsc_count();
__asm__ (             "movq   %1,             %%rsi;          \n\t"
                        "movq   %0,             %%rdi;          \n\t"
                        "movl   %2,             %%ecx;          \n\t"
                        "addq   $8,             %%rdi;          \n\t"
//                      "shl    $3,             %%ecx;          \n\t"
        "Schleife:       movsq;                                 \n\t"
                        "loop Schleife;                         \n\t"
                        :"=m"(packet_buffer)
                        :"r"(buffer), "r"(buflen)
                        :"%rsi", "%rdi", "%rcx", "memory"
                        );

stopb=rdtsc_count();

// ######################################### If I use one of these functions, everything is fine.
//usleep(1);
//printf("stopa %d\n", stopa);
//printf("fdsagfa\n");
// #########################################
printf("asm movsq took\t%d\tclocks\n", stopb-startb);

// ########################################
// Here I have the problem. It looks like stopb or startb is still 0 when I call no function between rdtsc_count() and the output.
// ########################################
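
For comparison with the movsq loop above, here is a minimal sketch of one way to let the compiler see which registers the copy uses, assuming x86-64 and GCC-style extended asm. It swaps the hand-written label and loop instruction for rep movsq, and binds the pointers and the count directly to RDI, RSI and RCX with read-write constraints instead of moving them in by hand. The name copy_qwords is hypothetical. This is not a confirmed fix for the behaviour described, and it has not been checked on KNC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original post).
 * Copies n 64-bit words from src to dst with REP MOVSQ.
 * "+D", "+S" and "+c" place dst, src and n in RDI, RSI and RCX and mark
 * them as read-write, because the instruction advances both pointers and
 * counts RCX down to zero. The "memory" clobber tells the compiler that
 * memory is read and written inside the asm statement, and __volatile__
 * keeps the statement from being removed or reordered against other
 * volatile asm such as a timing read. */
static inline void copy_qwords(uint64_t *dst, const uint64_t *src, size_t n)
{
    __asm__ __volatile__("rep movsq"
                         : "+D"(dst), "+S"(src), "+c"(n)
                         :
                         : "memory");
}
```

With the names from the listing, the call would look like copy_qwords(&packet_buffer[waddr+1], buffer, buflen). Because there is no label inside the asm, the statement can also be inlined more than once without a duplicate-symbol error from the assembler.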





