I'm having a problem with building the coq package for the new OCaml 4.00.0, and I'm at my wits' end. There were some bad interactions between the new OCaml, camlp5, and coq which I think I have successfully worked around. (It isn't pretty, but it seems to do the job.) But now, after the tools are built, the documentation building step is failing due to seemingly random segfaults in the coqdoc tool. The segfault can be triggered reliably on i386, but I've only seen it once in awhile on x86_64. In fact, I did a build on an x86_64 Rawhide VM today that ran all the way to the end. If I build in koji or mock, coqdoc (which is invoked lots of times during the build) eventually segfaults, but on different invocations each time. I did a mock build, waited for it to fail, then did "mock --shell" to see what I could learn. If I run coqdoc inside of gdb, it *never* segfaults. I've run it and run it, and it never, ever has segfaulted. I even did "set disable-randomization off", but it still ran successfully every time. If I run it under valgrind, it *always* segfaults. Here is some sample output from valgrind (after manually rebuilding coqdoc to get debugging symbols; the Makefile strips it once it is built): ==25130== ==25130== Conditional jump or move depends on uninitialised value(s) ==25130== at 0x809E1C5: invert_pointer_at (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x809E499: do_compaction (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x80903DA: caml_major_collection_slice (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x8090AC3: caml_minor_collection (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x808EBD3: caml_garbage_collection (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x809FC0D: ??? (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x8076DF6: camlPrintf__kfprintf_1356 (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x80559D1: camlOutput__printf_1012 (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== ==25130== Use of uninitialised value of size 4 ==25130== at 0x809E1EB: invert_pointer_at (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x809E499: do_compaction (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x80903DA: caml_major_collection_slice (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x8090AC3: caml_minor_collection (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x808EBD3: caml_garbage_collection (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x809FC0D: ??? (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x8076DF6: camlPrintf__kfprintf_1356 (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x80559D1: camlOutput__printf_1012 (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== ==25130== Use of uninitialised value of size 4 ==25130== at 0x809E1F2: invert_pointer_at (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x809E499: do_compaction (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x80903DA: caml_major_collection_slice (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x8090AC3: caml_minor_collection (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x808EBD3: caml_garbage_collection (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x809FC0D: ??? (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x8076DF6: camlPrintf__kfprintf_1356 (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== by 0x80559D1: camlOutput__printf_1012 (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== ==25130== Invalid read of size 4 ==25130== at 0x8058AE3: camlOutput__fun_2066 (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== Address 0x804 is not stack'd, malloc'd or (recently) free'd ==25130== ==25130== ==25130== Process terminating with default action of signal 11 (SIGSEGV) ==25130== Access not within mapped region at address 0x804 ==25130== at 0x8058AE3: camlOutput__fun_2066 (in /builddir/build/BUILD/coq-8.3pl4/bin/coqdoc) ==25130== If you believe this happened as a result of a stack ==25130== overflow in your program's main thread (unlikely but ==25130== possible), you can try to increase the size of the ==25130== main thread stack using the --main-stacksize= flag. ==25130== The main thread stack size used in this run was 8388608. ==25130== ==25130== HEAP SUMMARY: ==25130== in use at exit: 146,066,664 bytes in 313 blocks ==25130== total heap usage: 4,692 allocs, 4,379 frees, 294,975,004 bytes allocated ==25130== ==25130== LEAK SUMMARY: ==25130== definitely lost: 0 bytes in 0 blocks ==25130== indirectly lost: 0 bytes in 0 blocks ==25130== possibly lost: 140,299,840 bytes in 228 blocks ==25130== still reachable: 5,766,824 bytes in 85 blocks ==25130== suppressed: 0 bytes in 0 blocks ==25130== Rerun with --leak-check=full to see details of leaked memory ==25130== ==25130== For counts of detected and suppressed errors, rerun with: -v ==25130== Use --track-origins=yes to see where uninitialised values come from ==25130== ERROR SUMMARY: 8951807 errors from 4 contexts (suppressed: 0 from 0) Segmentation fault (core dumped) The first three warnings may or may not be significant; I don't know. The actual segfaulting address is the second instruction of its function, which starts like this: Dump of assembler code for function camlOutput__fun_2066: 0x08058ae0 <+0>: sub $0x10,%esp 0x08058ae3 <+3>: mov 0x4(%ebx),%ecx I don't know how to debug this any further. This is a problem because coq has broken dependencies due to the OCaml upgrade, and I have several other packages with broken dependencies that cannot be rebuilt until coq is. Sorry for the large broken deps text at the top of the Rawhide reports. -- Jerry James http://www.jamezone.org/ -- devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxxx https://admin.fedoraproject.org/mailman/listinfo/devel