On 18/10/2019 13:41, Josef Wolf wrote:
On Fri, Oct 18, 2019 at 11:25:53AM +0100, Richard Earnshaw (lists) wrote:
void *
__attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
memset (void *s, int c, size_t n)
{
  /* The attribute stops GCC from recognizing this loop as a memset
     and replacing it with a call to memset itself.  */
  size_t i;
  for (i = 0; i < n; i++)
    ((char *)s)[i] = c;
  return s;
}
Wouldn't
void *memset (void *s, int c, size_t n)
{
  return __builtin_memset (s, c, n);
}
be a cleaner solution to this?
Unfortunately, this compiles to a jump to itself, no matter whether I use the
-fno-builtin-memset flag or not.
On most targets __builtin_memset will only compile to inlined code if
the size is known (and sufficiently small); it's intended for cases
where you probably don't want a loop, but do want to make use of the
known size and alignment.  It's not expected to be an all-singing,
all-dancing memset for this specific CPU.
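To illustrate (a hypothetical sketch with made-up function names, not
code from this thread): with a small constant size the builtin typically
expands inline, while a run-time size typically falls back to an
ordinary call to memset, which is why the definition above just branches
to itself.

#include <stddef.h>

/* Hypothetical examples; clear_fixed and clear_var are made-up names.
   With a small constant size, GCC can expand __builtin_memset into a
   few stores; with a run-time size it generally emits a call to memset.  */

void
clear_fixed (char *buf)
{
  __builtin_memset (buf, 0, 16);        /* typically inlined as stores */
}

void
clear_var (char *buf, size_t n)
{
  __builtin_memset (buf, 0, n);         /* typically a call to memset */
}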
Writing a good memset can be hard (writing memcpy is even harder) and
compilers rarely do as well as the best assembly code when trying to
handle all the important cases; but they can do better in the limited
conditions where the size and alignment are statically known since many
hard-to-predict branches can be entirely eliminated.
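As a rough sketch of what those branches look like (hypothetical code,
only to illustrate the point): a general-purpose memset has to test
alignment and remaining size at run time, and it is exactly these tests
the compiler can delete when both are statically known.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical general-purpose fill: align, fill word-by-word, then
   finish the tail.  Every 'while' below is a run-time branch that can
   be removed when size and alignment are known statically.  (This is
   implementation-style code; the word store technically bends strict
   aliasing, as real memset implementations commonly do.)  */
void *
generic_memset (void *s, int c, size_t n)
{
  unsigned char *p = s;
  uintptr_t w = (unsigned char) c;

  /* Replicate the fill byte across a whole word.  */
  w |= w << 8;
  w |= w << 16;
#if UINTPTR_MAX > 0xffffffffUL
  w |= w << 32;
#endif

  /* Head: byte stores until the pointer is word-aligned.  */
  while (n > 0 && ((uintptr_t) p % sizeof w) != 0)
    {
      *p++ = (unsigned char) c;
      n--;
    }

  /* Body: one word per store.  */
  while (n >= sizeof w)
    {
      *(uintptr_t *) (void *) p = w;
      p += sizeof w;
      n -= sizeof w;
    }

  /* Tail: remaining bytes.  */
  while (n-- > 0)
    *p++ = (unsigned char) c;

  return s;
}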
So in most cases, you *want* the compiler to call memset if the
operation cannot really be optimized.
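For instance (a hypothetical caller, not from this thread): with enough
optimization (-O3, or -O2 on newer GCC), the loop below is normally
recognized and replaced by a single call to memset.  That is a win
everywhere except inside memset itself, hence the
-fno-tree-loop-distribute-patterns attribute earlier in the thread.

#include <stddef.h>

/* Hypothetical caller-side code: -ftree-loop-distribute-patterns
   usually replaces this loop with one call to memset (buf, 0, n).  */
void
zero_buffer (char *buf, size_t n)
{
  for (size_t i = 0; i < n; i++)
    buf[i] = 0;
}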
R.