Re: strlen

Hi Jonny,

On 7/9/21 12:50 PM, Jonny Grant wrote:
> 
> 
> On 08/07/2021 12:06, Alejandro Colomar (man-pages) wrote:
>> On 7/8/21 12:07 PM, Jonny Grant wrote:
>>> Thank you for your reply.
>>>
>>> We can't guarantee safestrlen() won't be called with NULL. So because strlen() itself doesn't check for NULL in C standard we'd need to call the wrapper so that NULL can be checked for.
>>>
>>> I'd like to avoid the compiler removing certain execution paths.
>>> I'd rather keep all code paths, even if they are not taken, just in case a NULL pointer creeps in due to an external device that is connected to an embedded system.
>>>
>>>
>>> Probably this would work:
>>>
>>> size_t __attribute__((optimize("O0"))) safestrlen(const char * s)
>>> {
>>>      if (NULL == s) return 0;
>>>      else return strlen(s);
>>> }
>>
>> I don't think you need that.  Unless there's a bug in GCC, it shouldn't optimize that path away unless it is 100% sure that it will never be taken.
> 
> That is good, so the code will always be kept! As compiler will never find all calls to strlen() and be sure those calls are never NULL.

Not always.  If you inline that function, that path may be removed in
some calls, if the compiler can prove that the pointer is never NULL
there.  My point is that you shouldn't care; your code is completely
legal, and whatever the compiler decides to do will also be legal (no
undefined behavior, and no crashes).  If it optimizes, that is a good
thing, and you shouldn't prevent it.

If the compiler does otherwise, that's a bug in the compiler, and
something you can't solve by writing different code or preventing
optimizations, or at least there's no guarantee about it, since it's a
bug.  But I don't believe you'll find a bug in this case.  So please,
trust the compiler, at least when using perfectly defined behavior.  And
if you don't trust the compiler, which is perfectly reasonable, test it,
but don't try to work around bugs that don't exist.
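
To illustrate the point, here is a minimal sketch of the wrapper without
the optimize attribute.  The two caller functions are hypothetical names
I made up for the example; whether a given compiler really drops the
check in the first one depends on the compiler and the flags used.

```c
#include <stddef.h>
#include <string.h>

/* Plain, fully defined C: no attribute is needed to keep this correct. */
static inline size_t safestrlen(const char *s)
{
    if (s == NULL)
        return 0;
    return strlen(s);
}

/* Hypothetical caller: the argument is a string literal, so after
 * inlining the compiler may prove s != NULL and drop the check for
 * this call only.  The result is the same either way. */
size_t len_of_literal(void)
{
    return safestrlen("hello");
}

/* Hypothetical caller: nothing is known about p here, so the NULL
 * check has to stay in the generated code. */
size_t len_of_unknown(const char *p)
{
    return safestrlen(p);
}
```

If you want to check what your compiler does, comparing the assembly
output (for example with gcc -O2 -S) of both callers is enough.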

> 
>> Moreover, I recommend you to optimize as much as possible.
>> Even though NULL is possible in your code, I guess it's unlikely.
>>
>> Also, calling a function safe is too generic.
>> I'd call it with the suffix null, as it acts differently on NULL.
>>
>> Also, I recommend avoiding 'size_t' (and any other unsigned types, BTW).
>> See <https://google.github.io/styleguide/cppguide.html#Integer_Types>.
>> Use the POSIX type 'ssize_t'.
>> That also allows differentiating a length of 0 (i.e., "") from an invalid string (i.e., NULL), by returning -1 for NULL.
>>
> 
> https://man7.org/linux/man-pages/man3/strlen.3.html
> size_t strlen(const char *s);
> 
> I'd rather not change the return type from POSIX size_t in any wrapper of strlen. Unless it is part of C11 Annex K style standards improvement.

That's a historical accident.

A long time ago (long before I was born, and long before the first
standard; I mean in the early times of K&R C and Unix), unsigned types
were used more than they should have been, and the first C standards
(I mean the ANSI/ISO standards (i.e., C89, C99, ...), not POSIX), faced
with a lot of already existing code, didn't attempt to change the
language, but to document common usage.

In POSIX.1 there's a mix, because POSIX has the type 'ssize_t', which is
not defined by the C standards.  POSIX in general tends to use the
signed type 'ssize_t' for its POSIX-only functions (i.e., not in the C
standards).
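
As a rough sketch of what I suggested earlier, a wrapper returning
'ssize_t' could look like the following.  The name my_strlen_null is
only a placeholder of mine, and the sketch assumes a POSIX system and
ignores strings longer than SSIZE_MAX.

```c
#include <string.h>
#include <sys/types.h>  /* ssize_t: POSIX, not ISO C */

/* Hypothetical wrapper: returns -1 for NULL, so that an invalid
 * string can be told apart from the empty string (length 0). */
ssize_t my_strlen_null(const char *s)
{
    if (s == NULL)
        return -1;
    return (ssize_t) strlen(s);
}
```

A caller can then test for a negative return value to detect NULL,
which is not possible when 0 is returned for both NULL and "".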

Annex K was an attempt by Microsoft to provide safer functions, but
while some of the functions there have good intentions, most of them
are just badly designed.  Annex K is DOA, and will probably be
marked as deprecated in C22 (currently C2x).

I think that a standard should not try to design new functions, and
instead just annotate common usage, as they did in the first ones.
Problems like the ones Annex K suffers could have been detected early if
they had been implemented as an extension to some compiler(s) decade(s)
before being standardized.  Therefore, if the implementation passes the
test of time, you standardize it, else not, IMO.  Otherwise, we have a
standard that is declared deprecated in the next version of the
standard, similar to what is happening with the C++ standards (which,
guess what, BTW I recently read that they are undeprecating a lot of C
stuff they deprecated in the first standards).

Using unsigned for anything other than bit fields and similar stuff is
just *wrong*, as you can read in the Google C++ style guide I linked
before.
Another source you can read is this paper from Bjarne Stroustrup:
<http://open-std.org/JTC1/SC22/WG21/docs/papers/2019/p1428r0.pdf>
This is one of the few cases where I agree with something coming from
him.  I hope some day we get a ssizeof operator :-)
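
A small example of the kind of bug I have in mind with unsigned
arithmetic; both function names are made up for the illustration, and
the signed variant again assumes POSIX 'ssize_t' and lengths that fit
in it.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>  /* ssize_t: POSIX, not ISO C */

/* Buggy on purpose: size_t arithmetic wraps around, so when a is
 * shorter than b the difference is a huge positive value and the
 * comparison is still true. */
int longer_unsigned(const char *a, const char *b)
{
    return strlen(a) - strlen(b) > 0;
}

/* With signed lengths the difference can be negative, and the
 * comparison does what it says. */
int longer_signed(const char *a, const char *b)
{
    ssize_t diff = (ssize_t) strlen(a) - (ssize_t) strlen(b);

    return diff > 0;
}
```

With a = "ab" and b = "hello", the unsigned version reports that a is
longer, while the signed version does not.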


Cheers,

Alex


-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/


