Ok, John. Well done. I think I can follow your thoughts. Let's try to put in my figures into "q is greater than apM(1-d)" You can't lower p below a certain point. And all depends on a and d. If I get some tool to produce interfaces I can lower a a lot. If the interfaces are really simple and regular I've got a better d for interface code than for average code in tests. (d depends on the intelligence in the code, the more simple the code the higher becomes d, it's harder to make undetected faults in simple code) And remember we're talking about well tested code to highest safety level (ICE61508 calls that SIL4) so d is near to 1, state of the art, the best you can think of (at least it is stated so and I'm working on that the whole day). A guess is that a failure showing up from p(1-d) in safety critical code is in about the same magnitude as the probability that a bit fails in RAM or on disk randomly and not caught in an average PC. It happens in both cases. (we're doing something extra 'cause we are required for better) Beside undetected hardware faults during software production the question is how many failures has a linker? Or a copy program, or the OS running the copy program? What is the probability that this will effect my linking and copying? The less we run that programs with unknown quality the better off we are. The idea is: If I have once positively tested (a binary object of) a module I'd like to bank on that tests as long as possible. And, John, do not forget the task: "just make a small change to a big software". We can put that in numbers, too: 1MB of code is not a big program, even not in embedded systems today. A module may contribute with 1k bytes, because the majority of 1MB are libs and the like. Modules are small and simple. So there's another 0.001 reduction to q. I'm still convinced that it would be good to have a checksum for 0.999MB of code that stays invariant compared to 0.001MB code that may introduce yet unknown errors. Also one other point has its importance, that is - with checksums - I'm able to exclude (0.001 for small changes) handling errors. accidentally coming up with older versions, older failures, ... SCCS and scripts are preventing all these failures to certain extend. (while introducing their own) But if the programmer is in a hurry - and they constantly are - how would you estimate the probability that they compensate an error message from a script with manual intervention? How would you estimate the probability for undetected errors from that? (you know such excuses: "Shit, I thought that ...") If I have a checksum I'm (about) absolutely sure it's the same thing as the last time. kind regards Rolf John Carter wrote:
Ok, so let's threat model that.... Let's say probability of bug per line of code is p. Let's say the number of bugs of the sort of bug you are trying to prevent (link time corruption bugs not found on initial test) is q. Let's say the probability of bug removal from ordinary code (by inspection, by test, by static tool etc.) is d So the number of bugs in the ordinary code (as written) is pN where N is the number of lines of code in the program The number of bugs after bug removal is with link time / corruption bugs is pN(1-d)+qNow you have introduced a further M lines of interface code. Assume approximatelythe same bug rate, maybe less. Let's say the bug rate in the interface code is a*p where a is a number between 0.1 and 1 So you now have (pN+apM)(1-d) == p(1+aM/N)N(1-d) So compare the two bug rates... pN(1-d)+q vs p(1+aM/N)N(1-d) or pN(1-d)+q vs pN(1-d)+apM(1-d) So whether you have gained anything from this activity depends on whether q is greater than apM(1-d) You will have to plug you own numbers into that. My guess is the answer is a resounding "No!" since q is so very small compared to pM. On Sat, 29 Jul 2006, Rolf Schumacher wrote:Hi, John. for the first place I found a much simpler solution. I checked it with gcc -fPIC, and than applied prelink. Consider a module m1 calling function o2 in module m2. If we define an interface compilation unit for m2, say m2if, that implements o2if just calling o2 and m1 is not using o2 anymore instead of o2if, m1 stays invariant as long as the interface is invariant. I can do as much "small changes" to m2 as I like, the checksum of m1 (in memory and on disk) stays invariant. in code now, prior to invariance: m1.c: #include "m2.h" int main(void){o2();} m2.h: void o2(); m2.c: #include "m2.h" void o2() {printf("hello world");} and now after changes for invariance: m1.c: #include "m2if.h" int main(void){o2if();} m2if.h: void o2if(); m2if.c: #include "m2if.h" #include "m2.h" void o2if(){o2();} m2.h: void o2(); m2.c: #include "m2.h" void o2() {printf("hello world");} Conclusion: The object code invariance is gained just by coding rules.Introduce of object code invariance is applicable even to existing software.Benefit: all my module tests to m1 apply in the new software regardless of changes to m2. I haven't to repeat them. Remark: For reduction of validation tests of the software product as a whole against requirements I still need a reliable impact analysis in order to reduce tests. Invariance doesn't help here. Code object invariance gives evidence only for code integrity on the invariant part of the software, nothing more. However, that's still a lot. I'll do that! Thanks for your great help and all the philosophical hints.I wouldn't have done without even if the resulting solution is that simple.kind regards Rolf John Carter wrote:Hmm, you probably should be scanning Miller's handy... http://www.dwheeler.com/essays/high-assurance-floss.html High Assurance (for Security or Safety) and Free-Libre / Open Source Software (FLOSS)... with Lots on Formal Methods On Wed, 26 Jul 2006, Rolf Schumacher wrote:Pretty rare, but they happen.We just had to recall projects in an expensive way upon a difference in gcc compiling for SUN and for Intel. (const in parameters) Debuggin was done on a SUN, delivery was for Intel.Test like you fly, fly what you tested... But hmm, you said debug not test... So I think there is more to that issue than meets the eye...In safety critical systems we have to demonstrate (!) 10**-9. For example, systems in an atomic power planthave to be secure to 10**-13 (asaik). They are not allowed to add more risk. You have to have risk reduction technologies because you can't reach thatfigures with software.I'm reminded of the Bad Old Days when there were MilSpec computers. Until they realized that the sheer weight of consumer COTS products meant that what was available from the corner store was... * Way way cheaper. * Way way faster. * And much more reliable! Happened again with handheld GPS during the Gulf War. The COTS /Consumer GPS's were just so much better than the MilSpec ones (even withthe delibrate signal fuzzing!!) that they gave up and used the COTS.The other thought that comes to mind is a variant of a very old joke....Patient to Doctor, "Doctor! Doctor! I need to be incredibly hugely impossibly painfully costly reliable to do this." Doctor, "Well don't do that then."Just the fact that you can think about an error draws the responsibilityto give an accepted figure for it: 1. HAZOP, 2. FMEA at least FTA,you do not have any statistics. It hasn't to be real at all in any past.Wow! That is really Amazing! You are _so_ deep in the Dilbert Zone! Do you _ever_ see sunlight there? http://www.dilbert.com/comics/dilbert/archive/dilbert-20060724.htmlSome (targets/versions) of the GCC linker do relaxation passes. ie. Change long jumps to short jumps, change long references to shortoffsets. And since the size of the code has shrunk, they do that again,and again until it converges.Can I switch that off?Only applies to very few CPU's, don't know which one you are using. I met it on the HC12. Search "info gcc" for "relax".Basically you want each module to be a DLL/sharable object so the linkerdoes the absolute minimum of fix ups. You also need a strict acyclic dependency graph between the sharable objects and then link each layer with lower layers. Follow the standard tricks to make a sharable object / DLL.Now that's it: I need a link here to update my knowledge.http://www.dwheeler.com/program-library/Program-Library-HOWTO/x36.html http://people.redhat.com/drepper/dsohowto.pdf In fact Drepper's whole page is a gold mine of detailed info on ELF. http://people.redhat.com/~drepper/ In fact I'll make a wild guess.... If you really understood all the niches and corners of ELF, which is quite a large and hairy domain, what you want is already in there somewhere.You still need the objdump tricks I mentioned to pull just the sectionsyou care about out.ditoinfo binutilsWhat I wanted to tell you is, that you're completely right with the example of the Unix loader separating tasks by means of address space. I have to look at a module as a task that takes messages and respond with messages. As in UML sequence charts. What is the easiest way to implement a messaging system e.g. by macros for programmers that like to use function calls?Make it simple to use, complex == more lines of code == programmer mistakes. The one we are using involves declaring and packing and unpacking structs all over the place. Yuck! Tedious and error prone. I itch to rewrite using a simple convention that looks like an ordinary function declaration, definition and reference. And then add a bit of Ruby code generation magic to generate a header pulled in by the client and a header to be pulled in by the server. Oh,and glue it together with a small, possibly entirely non-portable bit ofC that understands varargs to serialize the arguments across the messaging interface. I bet I can get a huge reduction in code size, much simpler, much more reliable and better code. John Carter Phone : (64)(3) 358 6639 Tait Electronics Fax : (64)(3) 359 4632 PO Box 1645 Christchurch Email : john.carter@xxxxxxxxxx New Zealand Carter's Clarification of Murphy's Law."Things only ever go right so that they may go more spectacularly wrong later."From this principle, all of life and physics may be deduced.John Carter Phone : (64)(3) 358 6639 Tait Electronics Fax : (64)(3) 359 4632 PO Box 1645 Christchurch Email : john.carter@xxxxxxxxxx New Zealand Carter's Clarification of Murphy's Law."Things only ever go right so that they may go more spectacularly wrong later."From this principle, all of life and physics may be deduced.