Re: strncpy clarify result may not be null terminated

[bouncing a copy to linux-man with PDF attachment stripped]

Hi Alex,

At 2023-11-08T16:07:42+0100, Alejandro Colomar wrote:
> I understand your point of view, but disagree with it.  Deprecation by
> ISO C or POSIX takes very very long.  We had gets(3) for decades until
> they realized it should be removed from the standards.

I think it likely that the humans involved in the decision-making
processes realized that gets(3) _should_ be removed a long time before
it actually was.  It is often difficult to get to the truth of why there
is so much inertia, particularly when large commercial vendors are
involved; such entities have long traditions of opacity.

Sometimes it is because they send relatively clueless people as
representatives to the standards body, because they don't value
standards development as "real work" (how does it generate profit?),
because it's a handy place to dump someone who's been awarded a
sinecure--or who annoys many colleagues but isn't worth the effort to
fire, or because that person is on an unstated mission to frustrate a
market rival and doesn't care what the collateral damage is.

My favorite example of the last is when Groupe Bull sent a fool[1] to
the ISO 8859 standardization group.  DEC's MCS (multinational character
set) was a sound candidate to become ISO 8859-1 as-was, but it must have
been thought that this would be "handing a victory" to DEC, so the Bull
representative--one source says it was a Belgian--endorsed disruptive
changes that made the encoding objectively worse for representation of
standard French script.

Sometimes Gallic chauvinism has to take a back seat to giving Maynard,
Massachusetts a poke in the eye.

Source attached.  It's in French.

I therefore think it's beneficial for you to pursue your campaign
against strncpy().  Vested interests cling to interfaces for reasons
they won't disclose, and cargo-cult programmers will employ them for
reasons they don't understand.  One of the fruits of discussions like
these is that we can get the actual technical merits and demerits of
such interfaces on the record.

> So we had it in ISO C in C89 and C99, and only in C11 they realized it
> had to be removed.  POSIX hasn't even removed it yet!  I won't
> hesitate to kill a function just because of bureaucracy.

You can't kill it; implementations will retain it practically forever to
keep old code compiling.  But you can sometimes scare away the cargo
cultists by lighting yourself on fire and waving your arms.

> The standard, especially C89, was just a reflection of the
> commonalities of most implementations.  It was a burden of
> implementations to add new stuff or to remove existing stuff.  Later
> revisions of the standards invented more, though.

And for what it's worth, Dennis Ritchie thought they lost the plot by
doing so.[2]

I admire a great deal of what Ritchie achieved, but I'm not confident he
made the right call there.  One elitist explanation I've seen ventured
is that Bell Labs simply had inherently smarter people than most other
software development shops could gather.  _Maybe_ there is some truth to
that, but I would venture a hypothesis less grounded on individual
characteristics.  The CSRC was a _research_ environment.  It was
emphatically not about measuring productivity by counting lines of code,
or "moving fast and breaking stuff", or how many "Ship It" boxes you've
ticked on your projects in the last year.

Google was pretty explicit that suitability for production-line
code output was a design objective for the Go language.[3]  They had
hired tons upon tons of smart people but found that it was hard to get
their "ship it" metrics satisfactorily high when driving all their newly
hired sheep through the mine fields of C (and especially C++[4])
programming.  An old adage says, "it's a poor workman who blames his
tools".  But when nearly every worker to whom you give a set of tools
struggles with high failure rates, it's time to question the fitness of
those tools for the objective you have in mind.  So Google did, and
attempted to recreate for software engineers what Frederick Winslow
Taylor achieved for factory laborers a hundred years ago.  If there's
less room for individual initiative, creativity, or insight, too
bad--those don't keep the share price up.[5]  You're a grunt.  GBTW.

> In this case, since ISO C has no APIs that use strncpy(3), it could
> (and should) already deprecate strncpy(3) from ISO C.  POSIX still
> needs it while it keeps utmpx(5), because there's no other way to
> correctly write to the fixed-width buffers within struct utmpx.

I would like to emphasize that a fixed-width buffer is inherently an
uneasy fit with C-style strings in the first place.  The major selling
point of null-terminated strings is their length flexibility.  They are
the entire reason we don't use Pascal-style strings, upon which C coders
eagerly spit (too easily, when they embarrass themselves with
strncpy()).  And yet fixed-width buffers are traditionally ubiquitous in
C, especially in the days before the GNU Coding Standards (and
programmers' frequent desires for generality and adaptability) spurred C
codes to use dynamic allocation much more aggressively.
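
To make the mismatch concrete, here is a minimal sketch in C.  It
assumes a hypothetical "struct record" with an 8-byte "user" field
standing in for something like a member of struct utmpx; the names
field_set(), field_len(), and field_get() are invented for illustration
and come from no standard or library.  It shows the one job strncpy()
does well (filling a null-padded fixed-width field) and why the result
must not be treated as a C string afterward.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-width record; the field name and width are
   illustrative only. */
enum { FIELD_WIDTH = 8 };

struct record {
    char user[FIELD_WIDTH];  /* null-padded, NOT necessarily terminated */
};

/* Fill the field: strncpy() copies at most FIELD_WIDTH bytes and pads
   any remainder with null bytes.  Returns 1 if src did not fit. */
static int
field_set(struct record *r, const char *src)
{
    assert(r != NULL && src != NULL);
    strncpy(r->user, src, sizeof r->user);
    return strlen(src) > sizeof r->user;
}

/* Count the meaningful bytes without reading past the field's end.
   strlen() would be wrong here: a full field has no terminator. */
static size_t
field_len(const struct record *r)
{
    const char *end = memchr(r->user, '\0', sizeof r->user);
    return end ? (size_t)(end - r->user) : sizeof r->user;
}

/* Convert the field to a real C string; dst needs FIELD_WIDTH + 1
   bytes so that there is always room for the terminator. */
static void
field_get(char dst[FIELD_WIDTH + 1], const struct record *r)
{
    size_t n = field_len(r);
    memcpy(dst, r->user, n);
    dst[n] = '\0';
}
```

Storing "abcdefgh" with field_set() fills all eight bytes and leaves no
null byte anywhere in the field, which is exactly the case the thread
subject warns about.  A reader has to go through something like
field_get() (or printf() with a "%.*s" precision) to get a string back.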

Why were these practices in tension in a language as purportedly shot
through with genius as C was?  Because, in my opinion, it was a bit of
unfinished business in the language.  This is why malloc(3) and free(3)
are managed by the runtime rather than defined in the language proper.
Back in the 1970s and 1980s, "everybody knew" that you couldn't have safe
dynamic memory allocation without a garbage collector, and there was no
way to have a garbage collector run deterministically in general, a
fatal flaw in real-time applications.

(Even then, there were alternatives to throwing everything explicitly
onto the heap.[6])

Thanks to particular improvements in compiler development (originally
intended for code optimization), static analysis tools, an influential
(if under-recognized) research programming language called Cyclone,[7]
and a new language--Rust--that is making the fruits of these
improvements available to a wide audience, we're learning to be better
programmers.

...against the resistance of C grognards, who of course vociferously
oppose deprecation of strncpy(3), because (they claim) it never caused
_them_ any problems.

> > Also speaking only for myself, the Linux manpages are welcome to
> > discourage the use of any function that you feel is not a wise
> > choice for new programs, but the word "deprecated" should be
> > reserved for cases where there really has been a declaration of
> > deprecation by us and/or the standards.
> 
> If a function is deprecated by a standard or other entity, that will be
> reflected in the STANDARDS or HISTORY section.  For deprecation by the
> manual itself, the SYNOPSIS (and BUGS) sections are fine.  In the end,
> the word 'deprecate' isn't any magic.
> 
> 	From WordNet (r) 3.0 (2006) [wn]:
> 
> 	  deprecate
> 	      v 1: express strong disapproval of; deplore
> 
> That term applies to strncpy(3).

Yes, but Zack raises a good point.  Deprecation by ISO, by POSIX, by the
glibc developers, and by the Linux man-pages project are all different
things, and they all have different implications for portability.  It is
helpful for the everyday C programmer to know which of those
implications to infer.

Were I in your shoes, I would use the term "discourage".

"The Linux man-pages project discourages use of strncpy() {for the
reasons listed above, because ...}."

> But yes, we need to make sure that the APIs that need strncpy(3) are
> all deprecated.  If other Unix systems still need utmpx or similar
> stuff, strncpy(3) will still be necessary.

You might also say this: "The deprecated strncpy(3) is mainly used
in conjunction with other deprecated interfaces, like utmpx(5)."

Regards,
Branden

[1] The term "moron" also comes to mind.  Too strong a term?  Just
    applying Hanlon's Razor here.

[2] https://www.computerworld.com/article/2826125/the-future-according-to-dennis-ritchie--a-2000-interview-.html?page=2

    This, followed by his death, is why there's never been a third
    edition of _The C Programming Language_, which I guess continues to
    be a best-seller for its publisher, even though it's not a good idea
    for newcomers to C to learn from it, any more than Kernighan &
    Pike's _The Unix Programming Environment_ is.  (Once you've acquired
    a little historical perspective, they're _excellent_ resources!)

[3] https://go.dev/talks/2012/splash.article

    Just read every sentence containing the word "productive".

[4] https://google.github.io/styleguide/cppguide.html

[5] That has to await your elevation to the C-suite, where more
    marketing dollars will be spent burnishing your reputation as a
    "genius" than any level of personal productivity could conceivably
    justify.  See, e.g., Steve Jobs.  Silicon Valley's thought leaders
    are on a work slowdown, you see--their compensation ratio needs to
    be higher[8] or they won't turn their massive brains to the trivial
    problems of cold fusion or room-temperature superconductors.  Atlas
    ain't shrugging yet, but he's leaning over really far, shooting you
    a meaningful look, and clucking about the dire precedent set by this
    year's UAW strike.  Where are the Pinkertons when you need them?
    And what's Erik Prince up to these days?

[6] https://docs.adacore.com/gnat_ugx-docs/html/gnat_ugx/gnat_ugx/the_stacks.html
[7] https://en.wikipedia.org/wiki/Cyclone_(programming_language)
[8] https://www.epi.org/publication/ceo-pay-in-2021/
