Hi Hao,

> I used to run into the same issue around CRT code on x86. Use of
> `-ffreestanding` disables a number of optimizations, for example, the
> compiler cannot optimize
>
>     int data[4];
>     memset(&data, 0, sizeof(data));
>
> to a series of store operations, but leave it as a function call, which
> is rather overkill.

With an array initialiser GCC (having -ffreestanding) will optimise

    void g(int *data);
    void f() { int data[N] = { }; g(data); }

to stores, for N < 15, as observed at <http://franke.ms/cex/z/WEse7e>.

Structures have the nice property in GCC that one can do assignments like

    struct s { int data[N]; } d;
    void f() { d = (struct s) { }; }

and once again GCC will store for N < 15 <http://franke.ms/cex/z/KM4ced>.

> The issue in the original post can be resolved by writing through a
> pointer to `volatile char` like this:
>
>     void *memset2(void *s, int c, unsigned int n)
>     {
>         volatile char *b = s;
>         for (unsigned int i = 0; i < n; i++)
>             b[i] = c;
>         return s;
>     }

The GCC optimiser will also yield with a small amount of loop unrolling,

    void *memset2(void *s, int c, unsigned int n)
    {
        unsigned int i = 0;
        char *b = s;

        while (i + 1 < n) {
            b[i++] = c;
            b[i++] = c;
        }
        if (i < n)
            b[i] = c;

        return s;
    }

and I suppose I will implement something like this in assembly,
eventually, for speed reasons. Storing 32-bit longs rather than bytes is
faster still, but one complication is that the 68000 has address alignment
restrictions for 16-bit and 32-bit load/stores. And then the 68000 has the
MOVEM.L instruction, if one wants to max out on larger sizes. :)

Fredrik
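
For illustration, the 32-bit store idea with its alignment complication might be sketched in C roughly as below. This is only a sketch under assumptions, not the eventual assembly: the name `memset_words` and the `u32_alias` typedef are hypothetical, the GCC `may_alias` type attribute is used so the word stores do not run afoul of strict aliasing, and the code aligns to 4 bytes, which should be sufficient on the 68000 and on stricter targets alike.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a 32-bit word that may alias any object
   (GCC/Clang type attribute), so word stores into a byte buffer are OK. */
typedef uint32_t __attribute__((__may_alias__)) u32_alias;

/* Hypothetical name, not from the original post. */
void *memset_words(void *s, int c, size_t n)
{
    unsigned char *b = s;
    unsigned char byte = (unsigned char)c;

    /* Head: store single bytes until the address is 4-byte aligned. */
    while (n > 0 && ((uintptr_t)b & 3u) != 0) {
        *b++ = byte;
        n--;
    }

    /* Body: replicate the byte into all four lanes and store 32-bit words. */
    u32_alias word = (uint32_t)byte * 0x01010101u;
    u32_alias *w = (u32_alias *)b;
    while (n >= 4) {
        *w++ = word;
        n -= 4;
    }

    /* Tail: store the remaining 0 to 3 bytes one at a time. */
    b = (unsigned char *)w;
    while (n > 0) {
        *b++ = byte;
        n--;
    }

    return s;
}
```

Depending on the optimisation level, GCC may still recognise such loops as a memset pattern; if that happens, `-fno-tree-loop-distribute-patterns` is one option to try.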