Re: how to make code stay invariant

Rolf Schumacher <rolf@xxxxxxxxx> · Wed, 26 Jul 2006 00:15:56 +0200

Thank you John.

I need two more links. Please see below.

Rolf

John Carter wrote:
That's why you do Test Driven Development with a test harness to run all
automated tests.
Agreed, that's the preferred solution.

It really really does work in very real world environments with even
larger code bases. It really really does improve design. You really can
rerun all tests on 100000 lines of code.
In complex and safety-critical command and control systems
you rely heavily on good simulation in testing. The problem
of trusting the simulators often has to be solved by giving
the system a trial period after it goes operational.
That's the most expensive kind of testing.
Even for that problem a solution to "small changes" would count for a lot.

If you can't rerun all tests, it is quite simply because you designed it
wrong. You didn't design it for testability.
Trusted systems are often old legacy systems that are hard to replace ...

And in a safety critical App that is gravely remiss.
100% agreed.

Places to start reading are...
http://www.objectmentor.com/resources/bookstore/books/welc/
Ordered that, thanks.
http://www.agiledata.org/essays/tdd.html
I'm currently coaching a first project that is taking steps towards Agile methods,
so I'm aware of what you mean. However ...

We have to come up with some better idea.

TDD _is_ the better idea.
You should do that first, ok. But it's not enough, there could be more.

For now I'd like to focus solely on dynamic errors.
Errors that happen while compiling, linking, loading and running.

You know something, I have been bitten by some compiler bugs in my time.

Pretty rare, but they happen.
We just had to recall projects, at great expense,
because of a difference between gcc compiling for SUN
and for Intel (const in parameters).
Debugging was done on a SUN, delivery was for Intel.

I would estimate looking at my current project (200000+ LOC, a man
decade or two of development, real time embedded C) that we have about 3
full orders of magnitude more programmer bugs than compiler bugs. None
of them were sporadic. A correct program simply failed to compile.
If you manage to overcome that, you're a better professional than all the rest.
Even so, what have you got then? You're able to see the sporadic errors,
e.g. critical-region failures, state machines without conflict resolution, ...
and every programmer's limited accuracy as well.

On your orders of magnitude, my estimate: before testing, an average
programmer puts a fault into the code at every 20th decision.
With one decision per 10 LOC,
that's 0.005 per line. After thorough module testing, about 1 out of 50 is not found.
That's 10**-3. Integration, validation and system integration bring this to
10**-6. In safety-critical systems we have to demonstrate (!) 10**-9.
For example, systems in an atomic power plant
have to be secure to 10**-13 (afaik). They are not allowed to add more risk.
You have to have risk reduction technologies because you can't reach those
figures with software.
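
The chain of estimates above can be written out as a small sketch. The
function name residual_rate and the model (each test stage lets a fixed
fraction of the remaining faults escape) are assumptions for illustration;
the inputs are the rough estimates from the text, not measured data.
Taken literally, 1 fault per 20 decisions and 1 decision per 10 LOC give
0.005 faults per line, and 1 escape in 50 at module test gives about 10**-4,
so the 10**-3 quoted above may already include some margin.

```c
#include <stddef.h>

/*
 * Residual fault rate per line of code after a chain of test stages.
 * escape_fractions[i] is the (assumed) fraction of remaining faults
 * that stage i fails to find. All inputs are estimates, not data.
 */
double residual_rate(double faults_per_decision, double decisions_per_loc,
                     const double *escape_fractions, size_t n_stages)
{
    /* Faults per line before any testing. */
    double rate = faults_per_decision * decisions_per_loc;

    /* Each stage multiplies the rate by the fraction it lets through. */
    for (size_t i = 0; i < n_stages; i++)
        rate *= escape_fractions[i];

    return rate;
}
```

Calling residual_rate(1.0 / 20.0, 1.0 / 10.0, NULL, 0) gives the pre-test
rate; passing an array of escape fractions adds the later stages one by one.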

We have never been bitten by linker bugs at all. Well, admittedly
writing gnu ld script is actively user hostile, but it either worked or
it didn't.
Do you count "oh, sorry, somehow I used an outdated make file"?
And "maybe the SCCS had an error"?

We have had lots of loader bugs, but then for various strange reasons,
we wrote our own. In all my years programming I have never been bitten
by an OS loader bug. There is a moral there...
That's it, if I go for checksums I have to write my own loader.

An error I'd like to uncover is: I'm linking on a PC/XP
and somehow a bit changes just before the linker
packs the object to be written to the disk. Checksum is ok,
the object is bad.

Wow! That is such a low probability risk compared to Good Old Human stuff
ups, I wouldn't even give it a moment's thought unless I had actually
seen it happening once.
We estimate that probability at 10**-5 to 10**-6. At least we are not able
to show better figures. If you know a way to demonstrate better figures ...

If you really are having such errors you have a buggy linker, time for a
newer (or older) version fast, or you have buggy hardware. ie. Fix the
tool, don't create a kludgy workaround patch around the broken tool.
This is the important point: I have never had such an error. And it should
never happen, at least not undetected. But we are only one company. Take
Ariadne or Skylab: a thousand companies deliver software to such a project.
What would you do if you were in charge of safety in the purchasing department?
Believe the suppliers that it never happened in the past?

The mere fact that you can think of an error brings the responsibility
to give an accepted figure for it (1. HAZOP, 2. FMEA, at least FTA),
even if you do not have any statistics. It need not ever have happened in the past.

If I had a checksum from last linking and made no change
I could point to the failure immediately, e.g. at load time.
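
A minimal sketch of that idea: record a checksum of the known-good image
after linking and compare it again before loading. CRC-32 (the common
reflected polynomial 0xEDB88320) is used here only for illustration, and
the names crc32_buf and image_unchanged are hypothetical. It can only show
that the image differs from the recorded one; it cannot tell whether the
recorded image was good in the first place.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 over a buffer (reflected polynomial 0xEDB88320). */
uint32_t crc32_buf(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Returns 1 if the image still matches the checksum recorded at the
 * last known-good link, 0 otherwise.
 */
int image_unchanged(const unsigned char *image, size_t len, uint32_t recorded)
{
    return crc32_buf(image, len) == recorded;
}
```

A loader-side check would read the section bytes into memory, call
image_unchanged() with the stored value, and refuse to start on a mismatch.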

Some (targets/versions) of the GCC linker do relaxation passes. ie.
Change long jumps to short jumps, change long references to short
offsets. And since the size of the code has shrunk, they do that again,
and again until it converges.
Can I switch that off?

Basically you want each module to be a DLL/sharable object so the linker
does the absolute minimum of fix ups.

You also need a strict acyclic dependency graph between the sharable
objects and then link each layer with lower layers.
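
The layering rule can be checked mechanically. The sketch below assumes a
hypothetical table in which every sharable object has a layer number and
every dependency is a (user, used) pair of indices; none of these names
come from a real tool. If each object depends only on objects in a
strictly lower layer, the graph cannot contain a cycle.

```c
#include <stddef.h>

/* One "object uses object" edge; indices into a hypothetical module table. */
struct dep {
    size_t user;  /* the object that has the dependency */
    size_t used;  /* the object it depends on */
};

/*
 * Returns 1 if every dependency points to a strictly lower layer,
 * 0 otherwise. layer[i] is the layer number assigned to object i.
 */
int layering_ok(const struct dep *deps, size_t n_deps, const int *layer)
{
    for (size_t i = 0; i < n_deps; i++) {
        if (layer[deps[i].used] >= layer[deps[i].user])
            return 0;
    }
    return 1;
}
```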

Follow the standard tricks to make a sharable object / DLL.
Now that's it: I need a link here to update my knowledge.

You still need the objdump tricks I mentioned to pull just the sections
you care about out.
Ditto.

The point that I'm asking is,

Somehow your mailer lost everything you wrote after this point in your post!
Sorry, I was interrupted and couldn't finish.

What I wanted to tell you is,
that you're completely right with the example of the Unix loader
separating tasks by means of address space.

I have to look at a module as a task that takes messages and responds
with messages. As in UML sequence charts.

What is the easiest way to implement a messaging system, e.g. with macros,
for programmers who like to use function calls?
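
One possible shape for this, as a sketch only: macros that look like
function calls at the call site but merely post a message into the
receiving module's mailbox, which the module drains later. All names here
(struct mailbox, mailbox_post, motor_set_speed, ...) are hypothetical, and
the queue is single-threaded with no locking; a real system would need an
OS queue or proper synchronisation.

```c
#include <stddef.h>

/* Hypothetical message set for one module (here: a motor controller). */
enum msg_id { MSG_SET_SPEED, MSG_STOP };

struct msg {
    enum msg_id id;
    int arg;
};

/* Fixed-size FIFO mailbox; single-threaded, no locking. */
#define MAILBOX_CAP 8
struct mailbox {
    struct msg slots[MAILBOX_CAP];
    size_t head;   /* index of the oldest message */
    size_t count;  /* number of queued messages */
};

/* Append a message; returns 0 on success, -1 if the mailbox is full. */
int mailbox_post(struct mailbox *mb, struct msg m)
{
    if (mb->count == MAILBOX_CAP)
        return -1;
    mb->slots[(mb->head + mb->count) % MAILBOX_CAP] = m;
    mb->count++;
    return 0;
}

/* Remove the oldest message; returns 0 on success, -1 if empty. */
int mailbox_take(struct mailbox *mb, struct msg *out)
{
    if (mb->count == 0)
        return -1;
    *out = mb->slots[mb->head];
    mb->head = (mb->head + 1) % MAILBOX_CAP;
    mb->count--;
    return 0;
}

/* Call-like macros: the caller writes a "function call", a message is sent. */
#define motor_set_speed(mb, rpm) \
    mailbox_post((mb), (struct msg){ MSG_SET_SPEED, (rpm) })
#define motor_stop(mb) \
    mailbox_post((mb), (struct msg){ MSG_STOP, 0 })

/* Receiving side: the module's state and its message handler. */
struct motor {
    int rpm;
    int running;
};

void motor_handle(struct motor *m, const struct msg *msg)
{
    switch (msg->id) {
    case MSG_SET_SPEED:
        m->rpm = msg->arg;
        m->running = 1;
        break;
    case MSG_STOP:
        m->rpm = 0;
        m->running = 0;
        break;
    }
}
```

The caller writes motor_set_speed(&mb, 1200); the motor module later calls
mailbox_take() in its own loop and passes each message to motor_handle().
The only coupling between the two sides is the message format.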

This question seems to be just another way to look at the problem.

(I had a bit more text here, but as far as I remember that's the core.)

kind regards

Rolf