--- John Carter <john.carter@xxxxxxxxxx> wrote:

> On Thu, 24 Jan 2008, Ted Byers wrote:
>
> > You're half right. If your program uses library X, and that
> > library has a subtle bug in the function you're using, then the
> > result you get using a different library will be different. The
> > fix is not to ensure that you use the same library all the time,
> > but to ensure your test suite is sufficiently well developed that
> > you can detect such a bug, and use a different function (even if
> > you have to write it yourself) that routinely gives you provably
> > correct answers.
>
> Alas, Reality bites, we all suck, nobody on planet with a
> non-trivial product has perfect test coverage of code and state,
> and we all have bugs.

True. No one is perfect, and I never said that anyone had achieved
perfection. At best, perfection is a state one must strive for but can
never reach. That doesn't stop us from beginning with unit tests,
proceeding to integration tests and usability tests, &c., and adopting
a protocol that requires the test suite to be expanded every time a
new bug is found and prohibits new code from being added to an
application's code base unless all existing tests pass. Such a
practice generally results in the number of bugs per line of code
diminishing over time, although the total number of bugs may not. You
never stop trying when the kind of application you're helping develop
could have catastrophic consequences, for the company for which you're
developing it, for the people using it, or for people affected by the
facilities where it is used, should it fail in a bad way.

> And even if you have really really good coverage, you seldom have
> the time to rerun _every_ test after every change.

True. But standard practice here is to run the full test suite, with
no failures, before code is committed to the code base. That may be
overkill for an application that only supports drawing cartoons, but
in other industries, where real and significant harm can be done if an
application is wrong, it is a price no one questions.

> So given how much reality sucks, one of the eminently practical
> things you can do is reduce the variance between what you have
> tested and what you ship.

Right. So what is the problem with not upgrading all your development
machines to a new release of your tool chain until you have proven the
new version won't break your code? Or, when the new version reveals a
bug the previous version didn't (by producing results inconsistent
with a previous version of your application), until you have fixed the
bug and extended your test suite accordingly?

> Test like you fly, fly what you test.

Right. For the kinds of applications I develop, all tests in the suite
must pass before the application can be released for general use
(generally to consultants with doctorates in some aspect of
environmental science).

> And that applies to shipping products to customers, and it applies
> to internal products like shipping cross compilers to colleagues.

Right. We upgrade as soon as practicable when a new release of our
development tools becomes available, but the process includes stress
testing them, especially to prove that they don't break existing code.
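To make the earlier point about detecting a subtle library bug
concrete, here is a minimal sketch of the kind of pinned-value
regression test I have in mind. It is illustrative only: the function
and tolerances are my own stand-ins, not from any particular library,
and the expected values come from published tables of the standard
normal CDF.

    #include <cassert>
    #include <cmath>
    #include <iostream>

    // Stand-in for the library function under suspicion; here it is
    // built on std::erfc so the sketch is self-contained.
    double normal_cdf(double x) {
        return 0.5 * std::erfc(-x / std::sqrt(2.0));
    }

    int main() {
        // Pin the function to independently verified values, so a
        // subtle bug in the library (or in a new toolchain's build
        // of it) trips the suite instead of corrupting results.
        assert(std::fabs(normal_cdf(0.0) - 0.5) < 1e-12);
        assert(std::fabs(normal_cdf(1.96) - 0.9750021) < 1e-6);
        // Symmetry: CDF(-x) + CDF(x) must equal 1.
        assert(std::fabs(normal_cdf(-1.0) + normal_cdf(1.0) - 1.0)
               < 1e-12);
        std::cout << "all CDF checks passed\n";
        return 0;
    }

Rebuild and rerun tests like this under every new toolchain before
migrating a project to it; if one fires, you learn that something
changed before any user does.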
If a test in our existing suite fails with a new tool, we have no
option but to investigate whether the problem lies with something that
was missed in our previous testing (in which case the bug revealed is
fixed and additional tests are developed to improve our QA) or with
something in the new tool (for which we must find a workaround). All
of this must be done before a project can be migrated to the new tool.
But we do it in anticipation of relatively continual improvement in
our tools as new releases become available.

> As I said, Reality truly sucks.

Yup. There is a reason it is more expensive to develop applications in
some disciplines than in others.

> Hint: C/C++ based reality sucks even more since, unless you test
> heavily under Valgrind, most code has subtle uninitialized data
> bugs that often don't fire under even the heaviest testing. One of
> the reasons I like dynamic languages like Ruby.

This is debatable, and this probably isn't the forum to debate it (a
minimal example of the kind of bug Valgrind catches is sketched
below). Each programming language has its own problems, and some
problems transcend the language used. What really matters is the
experience and discipline of the team doing the work, especially the
senior programmers, architects, and team leads: people who know the
potential worst-case consequences of a bug in the application they're
developing, and who design and implement accordingly, with due
attention to QA and testability. No one will be too upset if a tool
used for animation in the entertainment industry occasionally fails
(apart, perhaps, from the people who paid for it, or for a good
animation), but if an application's failure could cost lives or harm
someone's health (e.g. software used in aircraft, such as the
autopilot or the navigation system, or in medicine, or in risk
assessment for environmental protection), one goes the extra mile to
ensure such failures don't happen. Good QA is more about the people
doing the work, and the protocols they use, than about the tools at
their disposal.

This is part of why I tried to explain to the OP that, instead of
going through major hoops on your developers' machines, you have a
smaller team assess the new tool, or suite of tools, and deploy it to
the core developers only after you have proven that the new tools
produce correct code when used on your existing application and test
suite. Once you have THAT proof, you can proceed confidently with a
routine deployment of the new tool on all the developers' machines. If
there is insufficient manpower or time to do it right, then don't
upgrade until there is; and recognize that if there is always
insufficient manpower or time to do it right, then those paying for
the work can't really afford to have it done right (a scary notion to
me, given the kinds of applications I routinely develop). This
protocol will likely be more efficient than one in which a major
effort is put into forcing all your developers' machines to use the
same versions of the same tool chain. I try to keep my tools current,
expecting continual improvement in their quality, but I would never go
through the kinds of hoops the OP described, as they struck me as
counterproductive: time spent neither developing new code for the
application nor trouble-shooting the combination of the new tool chain
with the existing code base and test suite.
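Coming back to John's Valgrind point, here is a minimal sketch of the
sort of uninitialized-data bug he means. Whether the wrong branch
executes depends on whatever garbage happens to sit in the variable,
which is exactly why such bugs can survive heavy testing; Valgrind's
memcheck flags the conditional jump on every run.

    #include <iostream>

    int main() {
        bool ready;         // never initialized: value is garbage
        // This may pass every ordinary test if the stack happens to
        // hold zero here, and then misbehave in the field when it
        // doesn't.
        if (ready)
            std::cout << "fast path\n";
        else
            std::cout << "slow path\n";
        return 0;
    }

Compile with -g and run the binary under valgrind: memcheck reports
"Conditional jump or move depends on uninitialised value(s)"
regardless of which branch happens to execute.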
I can work around deficiencies in the tools I use because I know my
tools. While I upgrade as soon as practicable, I don't do it until I
know a new tool won't break my existing code. If it does, I
investigate where the problem really lies: if it is in my code, I fix
it; if it is in the new version of the tool, I develop a workaround.
Only once I know the code I get from the new tools is correct do I
proceed with the upgrade.

Cheers,

Ted