Hi!

I am seeing an unaligned access on ARM in a place where I think it should not happen. I could track this down to the attached example.

I compile the code using the following command:

    arm-linux-gnueabihf-gcc -Wall -Os -g -Wa,-mthumb -mthumb-interwork -mno-unaligned-access -mhard-float -march=armv7-a -static -o gcc_example gcc_example.c

The problem is in the net_set_ip_header function. The relevant assembler output is:

    1044c: b513        push   {r0, r1, r4, lr}
    1044e: 4604        mov    r4, r0
    10450: 9200        str    r2, [sp, #0]
    10452: 300c        adds   r0, #12
    10454: 4a11        ldr    r2, [pc, #68]   ; (1049c <net_set_ip_header+0x50>)
    10456: 4b10        ldr    r3, [pc, #64]   ; (10498 <net_set_ip_header+0x4c>)
    10458: 447a        add    r2, pc
    1045a: 9101        str    r1, [sp, #4]
    1045c: f840 3c0c   str.w  r3, [r0, #-12]
    ...
    10498: 14000045    .word  0x14000045
    1049c: 00060a48    .word  0x00060a48

The compiler is smart and merges the constant stores to the struct ip_udp_hdr fields ip_hl_v, ip_tos and ip_len into one single 32-bit (word) access. The three constants are kept together in one PC-relative word (0x14000045 at 10498). That word is loaded into r3 and then stored to the structure in memory with the str.w instruction.

I compile with -mno-unaligned-access, the relevant structure members are only 8 and 16 bits wide, and the function argument uint8_t *pkt also tells the compiler that the address is not necessarily 32-bit aligned. So I think it should not do this.

Whether the code crashes on my CPU depends, of course, on the value of the pkt pointer passed to the function. The pkt += ETH_HDR_SIZE; line in main can be used to modify the pointer if needed.

If I compile with -O0 or -O1 it does not happen. It also does not happen when I uncomment the asm("nop"); line so that it acts as an optimization barrier.

I tested this with gcc 8.3.1, 9.3.0 and 10.1.0. They all behave more or less the same in this regard.

What am I missing? What am I doing wrong?

Thank you,
Lars
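P.S. To double-check my reading of the literal pool word, here is a minimal sketch that rebuilds it from the three field values. It assumes a little-endian target, and merged_header_word and check_literal_pool_word are hypothetical helper names that are not part of gcc_example.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in gcc_example.c): rebuild the 32-bit value a
 * little-endian CPU would store to cover ip_hl_v, ip_tos and ip_len. */
static uint32_t merged_header_word(uint8_t hl_v, uint8_t tos, uint16_t len_host)
{
    uint8_t bytes[4];

    /* Bytes in memory order, as laid out in struct ip_udp_hdr. */
    bytes[0] = hl_v;                        /* ip_hl_v */
    bytes[1] = tos;                         /* ip_tos */
    bytes[2] = (uint8_t)(len_host >> 8);    /* ip_len, network order: high byte */
    bytes[3] = (uint8_t)(len_host & 0xff);  /* ip_len, network order: low byte */

    /* Read the four bytes back as one little-endian word. */
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Compare against the .word at 10498 in the listing. */
static void check_literal_pool_word(void)
{
    /* 0x45, 0 and IP_HDR_SIZE (20) are the constants from the example. */
    assert(merged_header_word(0x45, 0, 20) == 0x14000045u);
}
```

With 0x45, 0 and 20 this gives 0x14000045, which matches the .word at 10498, so the str.w really does cover all three fields at once.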
#include <arpa/inet.h>
#include <string.h>
#include <sys/types.h>

#define PKTBUFSRX	4
#define PKTSIZE_ALIGN	1536
#define PKTALIGN	16
#define ETH_HDR_SIZE	14
#define IP_HDR_SIZE	20
#define IP_FLAGS_DFRAG	0x4000

static uint8_t net_pkt_buf[(PKTBUFSRX+1) * PKTSIZE_ALIGN + PKTALIGN];
static unsigned net_ip_id;

/*
 * Internet Protocol (IP) + UDP header.
 */
struct ip_udp_hdr {
	uint8_t		ip_hl_v;	/* header length and version */
	uint8_t		ip_tos;		/* type of service */
	uint16_t	ip_len;		/* total length */
	uint16_t	ip_id;		/* identification */
	uint16_t	ip_off;		/* fragment offset field */
	uint8_t		ip_ttl;		/* time to live */
	uint8_t		ip_p;		/* protocol */
	uint16_t	ip_sum;		/* checksum */
	struct in_addr	ip_src;		/* Source IP address */
	struct in_addr	ip_dst;		/* Destination IP address */
};

void net_set_ip_header(uint8_t *pkt, struct in_addr dest, struct in_addr source)
{
	struct ip_udp_hdr *ip = (struct ip_udp_hdr *)pkt;

	/*
	 * Construct an IP header.
	 */
	ip->ip_hl_v = 0x45;
	//asm("nop");
	ip->ip_tos = 0;
	ip->ip_len = htons(IP_HDR_SIZE);
	ip->ip_id = htons(net_ip_id++);
	ip->ip_off = htons(IP_FLAGS_DFRAG);	/* Don't fragment */
	ip->ip_ttl = 255;
	ip->ip_sum = 0;
	/* already in network byte order */
	memcpy((void *)&ip->ip_src, &source, sizeof(struct in_addr));
	/* already in network byte order */
	memcpy((void *)&ip->ip_dst, &dest, sizeof(struct in_addr));
}

int main(int argc, char** argv)
{
	uint8_t *pkt;
	struct in_addr bcast_ip, src_ip;

	bcast_ip.s_addr = 0xFFFFFFFFL;
	src_ip.s_addr = 0x0L;

	pkt = net_pkt_buf;
	memset((void *)pkt, 0, sizeof(net_pkt_buf));
	pkt += ETH_HDR_SIZE;

	net_set_ip_header(pkt, bcast_ip, src_ip);

	return 1;
}
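A note on the pointer value in main: the following sketch shows why adding ETH_HDR_SIZE (14) leaves pkt off a word boundary. It is only an illustration under an assumption: demo_buf is a hypothetical buffer that I force to 16-byte alignment so the result is deterministic, whereas net_pkt_buf in the example has no such guarantee. misalignment4 and ip_header_misalignment are hypothetical helper names as well.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_HDR_SIZE 14

/* Hypothetical helper: distance in bytes of p from the previous 4-byte
 * boundary (0 means word aligned). */
static unsigned misalignment4(const void *p)
{
    return (unsigned)((uintptr_t)p & 3u);
}

/* Assumption: alignment is forced here only to make the demo deterministic;
 * net_pkt_buf in gcc_example.c is a plain uint8_t array. */
static _Alignas(16) uint8_t demo_buf[64];

/* Misalignment of the IP header once the Ethernet header is skipped. */
static unsigned ip_header_misalignment(void)
{
    return misalignment4(demo_buf + ETH_HDR_SIZE);
}
```

Under that assumption the header starts 2 bytes past a word boundary (14 modulo 4), so a word store to its first field is an unaligned store.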