Hello.
For reasons I won't bore you with, we compile PostgreSQL from source
rather than use the standard packages for some of our databases.
We've compiled numerous PostgreSQL versions, from 11.1 to 14.4, using a
fairly generic and not particularly complicated compile script that has
worked successfully on dozens (possibly hundreds, I don't keep track :) )
of Red Hat boxes using numerous different versions of RHEL.
This script has worked without incident for *years*. Until last week,
when we tried to compile PostgreSQL 12.9 on an RHEL 7.9 box and it
bombed out with an error we have never seen before.
To be honest, I'm not sure what's going wrong. I am by no means a Linux
sysadmin or compile expert. I just run the script (and a variety of other
post-build steps ...)
Our basic process:
1. Install pre-requisite libraries/packages:
yum install pam-devel
yum install libxml2-devel
yum install libxslt-devel
yum install openldap
yum install openldap-devel
yum install uuid-devel
yum install readline-devel
yum install openssl-devel
yum install libicu-devel
yum install gcc
yum install make
2. Create a user to compile the source and own the software. For
example, pgbuild
3. Build a couple of directories owned by the build user for the
destination, source, etc. We then run the following script under the
build user.
targetdir={directory to install postgresql into}
sourcedir={directory where the postgresql unzipped and untarred tarball has been located}
builddir={temporary build directory}
port={port number}
rm -Rf ${targetdir}
rm -Rf ${builddir}
mkdir ${targetdir}
mkdir ${builddir}
cd ${builddir}
${sourcedir}/configure --prefix=${targetdir} --with-pgport=${port} \
--with-openssl \
--with-ldap \
--with-pam \
--with-icu \
--with-libxml \
--with-ossp-uuid \
--with-libxslt \
--with-libedit-preferred \
--with-gssapi \
--enable-debug
rc=$?
if [ $rc -ne 0 ]
then
echo "#### ERROR! Configure returned non-zero code $rc - press RETURN to continue / Ctrl+C to abort"
read ok
fi
make world
rc=$?
if [ $rc -ne 0 ]
then
echo "#### ERROR! make world returned non-zero code $rc - press RETURN to continue / Ctrl+C to abort"
read ok
fi
make check
rc=$?
if [ $rc -ne 0 ]
then
echo "#### ERROR! make check returned non-zero code $rc - press RETURN to continue / Ctrl+C to abort"
read ok
fi
make install-world
rc=$?
if [ $rc -ne 0 ]
then
echo "#### ERROR! install-world returned non-zero code $rc - press RETURN to continue / Ctrl+C to abort"
read ok
fi
So, pretty straightforward stuff. Run configure, make world, make check,
make install-world and a little bit of basic error checking after each step.
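
The check repeated after each step in the script above could be folded
into one helper. This is only a sketch: run_step is a hypothetical name
that does not appear in the original script, and the "|| :" after read is
an addition so the helper also behaves when no terminal is attached.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): run one build
# step and pause on failure, mirroring the check repeated after each step.
run_step() {
    label=$1
    shift
    rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]
    then
        echo "#### ERROR! $label returned non-zero code $rc - press RETURN to continue / Ctrl+C to abort"
        # Wait for RETURN; do not fail if stdin is closed.
        read ok || :
    fi
    return "$rc"
}

# Intended use inside the build script (not executed here):
#   run_step "make world"    make world
#   run_step "make check"    make check
#   run_step "install-world" make install-world
```

Like the original, this keeps going after RETURN is pressed; the return
code is passed back so a caller could choose to stop instead.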
For years we've been able to run this script without issue, until last
week, when configure failed on one of our servers. After the usual
hundreds of lines of text, configure output the following:
checking for library containing gss_init_sec_context... no
configure: error: could not find function 'gss_init_sec_context' required for GSSAPI
And then bombed out with rc 1. Rest of the script aborted due to our
error checking.
Bit odd, nothing we've seen before on dozens/numerous other compiles
across the enterprise.
Then I spotted that our libraries pre-install doesn't include anything
for GSSAPI. Bit of a bug in our pre-reqs step, perhaps we've got away
with it previously and this one server in our whole estate doesn't have
GSSAPI. I need to figure out how to install GSSAPI, but that's a bit of
a faff and I need to get this build tested in a hurry.
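
A minimal sketch of how the pre-reqs step could be checked on a box
before building. missing_packages is a hypothetical helper name, and
krb5-devel is an assumption: it is the package I would expect to supply
the GSSAPI development files on RHEL, but that has not been confirmed on
this server.

```shell
#!/bin/sh
# Hypothetical helper: print every package named on the command line
# that rpm does not report as installed.
missing_packages() {
    for pkg in "$@"
    do
        if ! rpm -q "$pkg" >/dev/null 2>&1
        then
            echo "$pkg"
        fi
    done
}

# Intended use (not executed here). krb5-devel is an ASSUMPTION about
# which package provides the GSSAPI headers and libraries on RHEL.
#   missing_packages pam-devel libxml2-devel libxslt-devel openldap \
#       openldap-devel uuid-devel readline-devel openssl-devel \
#       libicu-devel gcc make krb5-devel
```

Anything the function prints is a package that rpm could not find.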
So I simply removed the --with-gssapi, and tried again.
AND IT FAILED AGAIN.
This time it failed claiming it couldn't find the ldap library. Which is
most -definitely- present.
I have no idea what's going on at this point. We have *never* had any
issues like this. This script/process has been in place for years and
we've never had any issues with it.
It gets weirder.
The configure and make world steps work perfectly if the script is
run under root. Though, of course, the make check step fails. Running it
under root was inadvertent, but the fact that the configure and make steps
seemed to have run successfully was a bit of a surprise.
So a fairly basic script that has been used for years suddenly fails on
a fairly generic RHEL 7.9 server.
I am no compilation expert. Obviously. Have I missed something basic? As
I said, we've not seen problems like this before. Could there be some
sort of issue with the box's configuration? If it works for root but not
our usual build user, could there be a problem with our account's user
configuration? Can anyone offer any insight on what I need to check? At
the moment it all seems somewhat ... mystifying.
I am assuming there must be something wrong with the box/our
configuration somewhere, but where to look? If anyone can help - even if
it's to tell me I'm an idiot for missing one or more incredibly basic
things somehow - I would be very grateful.
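
For anyone wanting to reproduce the comparison, here is a rough sketch
of two checks that could be run in the build directory, once as root and
once as the build user. Both function names are hypothetical, and the
list of environment variables is only a guess at what might differ
between the two accounts. config.log is the log that configure writes in
the directory it is run from.

```shell
#!/bin/sh
# Hypothetical helper: show the config.log entry for one library check,
# including the compiler command and error output that follow it.
show_failed_check() {
    symbol=$1
    log=${2:-config.log}
    grep -n -A 20 "checking for library containing $symbol" "$log"
}

# Hypothetical helper: print settings that can change what the compiler
# and linker find. The variable list is an assumption, not exhaustive.
dump_build_env() {
    echo "user=$(id -un)"
    echo "umask=$(umask)"
    env | grep -E '^(PATH|LD_LIBRARY_PATH|LIBRARY_PATH|CPATH|C_INCLUDE_PATH|CC|CFLAGS|CPPFLAGS|LDFLAGS)=' | sort
}

# Intended use (not executed here):
#   show_failed_check gss_init_sec_context
#   dump_build_env > /tmp/buildenv.$(id -un)
#   diff /tmp/buildenv.root /tmp/buildenv.pgbuild
```

The config.log excerpt should show the actual compiler or linker error
behind the "no", which is usually more telling than the one-line
message configure prints.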
Many thanks.
Regards,
M.
--
Martin Goodson.
"Have you thought up some clever plan, Doctor?"
"Yes, Jamie, I believe I have."
"What're you going to do?"
"Bung a rock at it."