Re: The price of FHS

Paul Dufresne via devel writes:

 > Now I do believe the reason you need to give a version to shared 
 > libraries is because of the FHS. Because FHS suggest to regroup 
 > libraries inside a specific directory and/or directories. But if you 
 > have a common directory that contains every packages inside their own 
 > directory, things because simpler because the directory identify 
 > uniquely a library.

However, none of the above is entirely true, and the minor differences
matter.  You don't need to version shared libraries at all; the system
(specifically ld-linux.so) will pick the first compatibly-named
library on the load path.  So you can specify different libraries by
manipulating the load path (as well as several other mechanisms, some
of which may not be available depending on the system's security posture).
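The first-match rule is easy to picture with a toy sketch (Python; the
directory and library names are made up for illustration — the real work
is done by ld-linux.so, this just models its "first compatibly-named hit
on the search path wins" behavior):

```python
import os
import tempfile

def find_first_match(soname, search_dirs):
    """Return the first entry matching `soname`, the way the dynamic
    loader stops at the first compatibly-named library on its path."""
    for d in search_dirs:
        candidate = os.path.join(d, soname)
        if os.path.exists(candidate):
            return candidate
    return None

# Demonstration with two stand-in "library" files:
with tempfile.TemporaryDirectory() as root:
    old = os.path.join(root, "old"); os.mkdir(old)
    new = os.path.join(root, "new"); os.mkdir(new)
    for d in (old, new):
        open(os.path.join(d, "libdemo.so.1"), "w").close()
    # Prepending a directory (as LD_LIBRARY_PATH does) changes the winner:
    print(find_first_match("libdemo.so.1", [old, new]))  # the copy under old/ wins
    print(find_first_match("libdemo.so.1", [new, old]))  # now the copy under new/ wins
```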

The need to version shared libraries is not based on the FHS; it's
that the library APIs or ABIs or semantics differ, so programs that
target those APIs/ABIs/semantics can get what they're expecting (if
they don't, often you get a core dump, or garbage output, or even a
security vulnerability).  We've learned over the ages for security and
reliability reasons that these version variables and checks really
need to be in the library and application respectively, so even if you
wanted to go with a fully directory-based library search mechanism,
you'd still need the version information.  The libfoo.so.X.Y.Z naming
convention turns out to be simple enough to be quite reliable for
system administration, and the rare slip-up is caught by the version
checks in the code.

The .so version *does* uniquely identify a library, regardless of
directory (with some exceptions such as debug libraries that NixOS
also presumably handles).  But according to another poster, NixOS uses
links to populate an application's library directory, so you actually
don't know what those libraries are from their names, at least in
principle.  In practice I suspect that NixOS enforces naming
conventions so that you do know what they are, as long as you're using
NixOS packages.  In the Fedora system, by contrast, a library that
will pass the runtime version checks must *both* be named correctly
*and* have that name be truthful, since the version in the name needs
to match the version recorded inside the library.
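A sketch of that naming convention and the "name must be truthful"
point (Python; the function names and the major-version-compatibility
rule as stated here are my own simplification — the loader actually
compares against the library's internal DT_SONAME field):

```python
import re

def parse_lib_name(filename):
    """Split a 'libfoo.so.X.Y.Z' style name into (name, (X, Y, Z)).
    Raises ValueError for names outside the convention."""
    m = re.fullmatch(r"(lib[\w+-]+)\.so\.(\d+)(?:\.(\d+))?(?:\.(\d+))?", filename)
    if not m:
        raise ValueError(f"not a versioned library name: {filename}")
    name = m.group(1)
    version = tuple(int(g) for g in m.groups()[1:] if g is not None)
    return name, version

def name_matches_internal(filename, internal_soname):
    """True when the filename's library name and major version agree
    with the soname the library reports internally ('libfoo.so.2')."""
    name, version = parse_lib_name(filename)
    iname, iversion = parse_lib_name(internal_soname)
    return name == iname and version[0] == iversion[0]

print(parse_lib_name("libfoo.so.2.1.0"))                        # ('libfoo', (2, 1, 0))
print(name_matches_internal("libfoo.so.2.1.0", "libfoo.so.2"))  # True
print(name_matches_internal("libfoo.so.3.0.0", "libfoo.so.2"))  # False
```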

And finally, the FHS does provide for I-don't-need-no-shared-libraries
packages: it can't stop you from statically linking your executables
(although GNU libc *really* doesn't like that for some facilities,
like NSS), and it provides /opt for exactly the kind of package
management you propose.  Very few projects use it as far as I know
(to check, I'd have to find out exactly what "LANANA" is and look it
up; see the FHS if you wish to do so), presumably because of the benefits
provided by shared libraries, some of which are described below (and
of course there's the support that Fedora package management provides
for the FHS but doesn't provide for /opt-style packaging).

All of your statements are "approximately" true (except the statement
that FHS is a reason for library versioning) from the user's point of
view.  However, what you really are discussing is shared libraries
themselves.  If every binary had all its libraries compiled into it,
this "DLL Hell" (to borrow from the Windows world) would never occur.
So, why do we have shared libraries and DLL hell?

1.  Space is limited, both on disk *and in memory*.  A very basic
    library like ld-linux.so or libc.so is likely to have one hundred
    or more concurrent references on a moderately busy personal
    system.  This saves a *ton* of swapping.

2.  Bandwidth is limited.  Upgrading a large number of packages would
    require upgrading each one's copy of shared libraries.  Version
dependencies mean that you need to do that individually (although
    a Sufficiently Smart PMS could check for available versions on the
    system and copy them, you can be sure that will fail sometimes
    because the upstream package distributor has patched the library,
    and perhaps not changed the version to indicate that).

3.  There are often multiple protocols for a given operating system
    feature.  For example, back when I was a developer's egg,
    file-locking was done through (at least) three different
    protocols: dotfile, lockf, and flock.  It wasn't actually done
    this way ;-), but if there were a lock.so library, and all running
    processes used the same lock.so, nobody would step on anybody
    else's files.  (Nowadays the OS provides "mandatory locks",
    solving this problem and introducing others.)

4.  Particularly important are security protocols.  We really really
    want all of your (new) processes to upgrade to the latest versions
    of TLS and the latest cipher suites.  Upgrading your libssl.so
    makes all of that possible with one upgrade.  Another example is
    the resolver for various name services (the one that GNU libc is
    so finicky about).

5.  Some programs will try to load a shared object when an optional
    feature is requested, and gracefully fail if it's not found.
    Shared libraries allow the user to decide whether they want to
    encruftify their system with that library.

[description of filesystem-based library management system omitted]

You've reinvented library versioning, except you're using the
directory hierarchy as a database, rather than the .so version.  I'm
sure NixOS handles 1 and 2 above (eg, by using links to a common
instance of a particular library version), and maybe 5 (though given
the /opt philosophy, I suspect not).  I have my doubts about 4,
though (see below).
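The optional-feature point above is visible even from a scripting
language: Python's ctypes wraps dlopen, and the usual pattern is to
attempt the load and degrade gracefully (the library name below is
deliberately nonexistent):

```python
import ctypes
import ctypes.util

def load_optional(name):
    """Try to load an optional shared library; return None (rather
    than failing hard) when it is not installed on this system."""
    path = ctypes.util.find_library(name)
    if path is None:
        return None          # not installed: feature reported absent
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None          # found but unloadable: same graceful result

# A library that should not exist anywhere:
print(load_optional("surely-not-installed-anywhere"))  # None
```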

 > So you try the new version, it works.

I don't see how this differs from FHS-style organization, as long as
you build RPM packages, which is best practice anyway.  The only
advantage of the "every configuration option is a file system path"
approach that I can see is that it makes doing things "by hand", or
with roll-your-own scripting, easier.  But by the same token, you give
up access to all the experience embodied in RPM's complexity, in
particular its solution to the problem of maintaining multiple
versions of programs simultaneously.  You may prefer doing it
yourself, and that's fine; but then you won't be as happy with a
Fedora (RHEL, CentOS)
system as *most* others are, and you're unlikely to convince them to
change Fedora and friends.

If you want multiple versions of the application, sure, you have to do
some fiddling, but that's easy enough with the same device: you attach
a version string to the program's name.  Even if the application's
native build system doesn't allow that kind of configuration, RPM can
help you with it, I believe, and if not of course you can script it.
 
 > If nobody use programA_version1, you can delete
 > pkgs/programA_version1 and pkgs/libX_version1 now.

The process of removing unused library cruft works fine with package
managers managing FHS systems with shared libraries mostly found in a
single directory as well.  Most (all?) libraries are on your Fedora
system because some package you requested to be installed required
them.  If that package specifies a versioned dependency, that version
of the library will be required.  Once that package gets upgraded,
that requirement will be removed.  Once the last such requirement gets
removed, so does the library.  And you get all the benefits I
described above.
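That lifecycle can be sketched as a toy dependency tracker (Python;
the package and library names are invented, and real package managers
track much more than this): a library version stays installed exactly
as long as some installed package requires it.

```python
from collections import defaultdict

class Installer:
    """Toy model of versioned library dependencies: a library version
    is kept while at least one installed package requires it."""
    def __init__(self):
        self.requirers = defaultdict(set)   # library -> packages needing it

    def install(self, package, requires):
        for lib in requires:
            self.requirers[lib].add(package)

    def remove(self, package):
        gone = []
        for lib, pkgs in list(self.requirers.items()):
            pkgs.discard(package)
            if not pkgs:                    # last requirement removed:
                del self.requirers[lib]     # the library goes too
                gone.append(lib)
        return gone

inst = Installer()
inst.install("programA-1", ["libX.so.1"])
inst.install("programB-1", ["libX.so.1", "libY.so.2"])
print(inst.remove("programA-1"))   # [] -- libX.so.1 still needed by programB-1
print(inst.remove("programB-1"))   # now both libraries are unreferenced and dropped
```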

Of course there's the problem that some package you want to build from
source doesn't work correctly with the default (ie, most recent)
version of the library on the system.  Then you need to get the
appropriate version, and make sure the application links to it.  The
least-effort way to do this is to write a spec file with the version
dependency and use the system package manager to build and install a
package, which then requires (and installs) the appropriate version of
the library, because this is a common situation.  Of course there's
extra effort if you need to learn to write the spec file, but not that
much extra effort.  (If the library version doesn't exist in the
package repository, of course you'll need to get and build that too,
but that's true for both ways of organizing package installation.)
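For the build-from-source case, the versioned dependency goes straight
into the spec file.  A minimal fragment (every name and version here
is made up for illustration, and in practice rpmbuild generates the
soname dependency automatically; see Fedora's packaging guidelines for
the real conventions):

```spec
# Hypothetical spec fragment: pin the library generation this
# program was written against.
Name:           myprog
Version:        1.4
Release:        1%{?dist}
Summary:        Example program needing an older libfoo
License:        MIT

BuildRequires:  libfoo2-devel
# Runtime: depend on the ABI (the soname), not just a package name.
Requires:       libfoo.so.2()(64bit)
```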

So in sum, (1) I really don't see what the problems you're worrying
about are (except for the need to learn how Fedora handles those
situations), and (2) there may be serious defects in per-package
library management, such as the difficulty of enforcing system-wide
conformance to recommended security protocols and of encouraging
maintainers to update their packages to use new versions of required
libraries.  I suspect that it's really difficult for a per-package
hierarchy to discourage users from using old versions of libraries
that are strongly deprecated due to security vulnerabilities and the
like, when the package maintainer (or user!) can just overwrite the
link with an old library version (or a link to it!).

Steve
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx



