Re: binary reproducibility and c++ default allocators

galathaea wrote:
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-
At work, we use GCC 2.96 on Red Hat Linux 7.3.  We work in
an embedded industry that requires complete binary
reproducibility (BR) of compiles so regulators may verify the
source code reproduces the images they are given.
Consequently, we avoid anonymous namespaces and other known
violators of reproducibility.


We have recently found a strange occurrence of BR breakage.  We
have a file that is completely empty save for includes of
headers (many headers through one main include).  The .o
produced differs each build.  Tracing the translation unit
through the build process has shown that the assembly output
differs on a few lines related to allocators.


They have the format:

.stabs "_GLOBAL_.I._S_chunk_alloc__t24__default_alloc_template2b1i0UiRiSomefilepath.cppLyTUed:f(0,21)",36,0,1256,_GLOBAL_.I._S_chunk_alloc__t24__default_alloc_template2b1i0UiRiSomefilepath.cppLyTUed
        .type    _GLOBAL_.I._S_chunk_alloc__t24__default_alloc_template2b1i0UiRiSomefilepath.cppLyTUed,@function
_GLOBAL_.I._S_chunk_alloc__t24__default_alloc_template2b1i0UiRiSomefilepath.cppLyTUed:

This issue appears above some assembly that sets up a stack
frame, calls __static_initialization_and_destruction_0, then
resets the stack frame and returns, but I am not sure if that
is relevant (I found some older messages which approached
somewhat unrelated issues with static template members which
made me think mentioning it might be helpful).


My main question is concerning the Somefilepath.cpp (fictitious
here) that points to the compile path of the almost empty cpp
and, in particular, the "LyTUed" sequence appended to the end.
That is the sequence which differs in each compile.
  

I am trying to find out what is being done to cause this code to
be generated so we know what to avoid in the future.  It is not
simply use of the standard library, as vectors and strings are
used throughout the code without any problems.  If anyone knows
why the GUID characters and pathname are appended for these
functions into the assembly output, it would really benefit my
company.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-

First, I apologise for the original formatting.  I've inserted
line breaks above.

Now, by repeatedly compiling the translation unit in question and
commenting out sections of the includes in a binary search, I was
able to get more information about the problem.
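
For anyone wanting to repeat the narrowing-down, the binary search
over the includes can be sketched as below.  This is only an
illustration: bisectIncludes, StillBreaks and demoStillBreaks are
hypothetical names, the predicate stands in for "rebuild with only
these includes enabled and compare the .o files", and the sketch
assumes there is at least one include and that exactly one of them
is responsible.

```cpp
#include <cstddef>

// Hypothetical predicate: returns true when a build that keeps only the
// includes numbered [first, last) still shows the BR breakage.
typedef bool (*StillBreaks)(std::size_t first, std::size_t last);

// Narrow the range [0, count) down to a single include.
// Assumes count >= 1 and that exactly one include causes the breakage.
std::size_t bisectIncludes(std::size_t count, StillBreaks stillBreaks)
{
  std::size_t first = 0;
  std::size_t last = count;
  while (last - first > 1)
  {
    std::size_t mid = first + (last - first) / 2;
    if (stillBreaks(first, mid))
      last = mid;    // culprit is in the lower half
    else
      first = mid;   // culprit must be in the upper half
  }
  return first;
}

// Stand-in predicate for demonstration only: pretends include number 5
// is the one that breaks reproducibility.
bool demoStillBreaks(std::size_t first, std::size_t last)
{
  return first <= 5 && 5 < last;
}
```

With eight includes, bisectIncludes(8, demoStillBreaks) needs three
rebuilds to arrive at include number 5.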

Apparently, the problem is found in a header file with a const
array of strings set by an initialiser list.  I have created
a minimal set of 3 files to recreate the issue:

 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-
// binaryTest.h

#include <string>

std::string const stringArray[4] =
{
  "one",
  "two",
  "three",
  "four"
};

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-
// binaryTest.cpp

// I understand that technically the standard requires ostream as well
// but that is a different issue
#include <iostream>
#include "binaryTest.h"

int main()
{
  std::cout << "Hello world!" << std::endl;
}

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-
// other.cpp

#include "binaryTest.h"

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-
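
One thing the header makes easy to overlook: std::string has a
non-trivial constructor, so the array cannot be laid out as plain
data and has to be constructed by startup code that runs before
main().  The sketch below makes that visible with a hypothetical
stand-in type (Counted, countedArray and constructedSoFar are names
made up for the illustration, not part of our code).

```cpp
#include <string>

// Hypothetical stand-in for std::string that counts constructions.
struct Counted
{
  static int constructed;   // statically zero-initialised, before any dynamic initialisation
  std::string value;
  Counted(char const* s) : value(s) { ++constructed; }
};
int Counted::constructed = 0;

// Same shape as stringArray in binaryTest.h: a const array at file scope
// whose elements need constructor calls at program startup.
Counted const countedArray[4] =
{
  "one",
  "two",
  "three",
  "four"
};

// Reports how many elements have been constructed at the time of the call.
int constructedSoFar()
{
  return Counted::constructed;
}
```

Called from main(), constructedSoFar() reports that all four
elements already exist, although no code written in the translation
unit ever asked for them to be built.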


As I understand the C++ standard, constants at file scope should
be given internal linkage.  However, this is an "array of const"
which, similar to the pointer case, I would presume is different.
However, the two cases I would have assumed possible would be:
a)  internal linkage - no symbols generated externally for other.o
and thus no chance for symbols breaking binary reproducibility
b)  external linkage - symbols mangled by the compiler's convention
for external linkage are inserted into the .o deterministically,
according to the 2.96 ABI, with no breakage of BR.  The linker
should complain about violations of the ODR, though.

The strangest part of the issue is that the code has been a part
of our projects for the past 18 months with no BR breakage.  Only
recently, with the addition of an empty-save-for-single-include file
(which was just added to CVS and the Makefiles for future development)
did the problem appear.  Other files have included the code with no
problems.

As expected, the problem goes away by making the array extern and
placing the initialisation in a single cpp.  This is probably the
best "solution" to the problem, as we don't have any need for local
copies per translation unit, but I am not looking for a "solution"
(just removing the empty file from our build process also "fixes"
things).  I am looking for some understanding of some type of rule
I can use in the future to avoid BR failures.
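
For reference, the extern arrangement looks roughly like the sketch
below.  Both pieces are shown in one block for brevity, and
binaryTestData.cpp is a hypothetical file name chosen only for the
illustration.

```cpp
#include <string>

// binaryTest.h: declaration only, so including it defines nothing.
extern std::string const stringArray[4];

// binaryTestData.cpp (hypothetical file name): the single definition.
// This file must see the extern declaration above (by including
// binaryTest.h); otherwise the const definition would have internal
// linkage and the other translation units would fail to link.
std::string const stringArray[4] =
{
  "one",
  "two",
  "three",
  "four"
};
```

Every other file then includes binaryTest.h and refers to the one
shared array.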

The second I place:

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-
void f() {}
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-

into other.cpp, the BR problems disappear, which disturbs me.  I
don't like to see problems hide themselves.  This must be related
to the problem not appearing over the past year and a half.

The only hypothetical explanation I have devised for what is
currently being seen is that the code passes through the first
few stages of translation (the various preprocessing and initial
parsing stages) and generates some internal tokens.  Only if actual
executable code is encountered, though, do these structures receive
some form of cleanup during the final stages of translation, which
removes the generated tokens.  However, I must be missing something,
because I still do not understand how these would appear in the
final executable.

I would appreciate any insight into this problem.

galathaea
