Re: Build failure on the Reef release

On Wed, Dec 6, 2023 at 4:06 AM Yong Yuan <yycslab@xxxxxxxxx> wrote:
Thanks for all the help, Kefu. It looks like running ninja multiple times with different parallelism, possibly along with previous runs having failed or being killed, may cause failures? 

probably. but i don't understand how GNU as 2.30 was used if you ran "do_cmake.sh" and "ninja" in the bash launched by "scl enable gcc-toolset-11 bash". maybe you somehow ran ninja in a bash which was not launched by "scl enable gcc-toolset-11 bash"? you could have verified this by running "as --version", just like you ran "gcc -v".
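
A minimal sketch of that check, assuming the toolset's assembler is installed under the same /opt/rh/gcc-toolset-11/root/usr prefix that appears in the "gcc -v" output (the helper name in_toolset is made up for illustration):

```shell
# Report which assembler the current shell resolves and whether it lives
# under the gcc-toolset-11 prefix seen in the "gcc -v" output.
# Assumption: the toolset's "as" is installed below this prefix.
TOOLSET_PREFIX=/opt/rh/gcc-toolset-11/root/usr

# Succeeds if the given path is below TOOLSET_PREFIX.
in_toolset() {
  case "$1" in
    "$TOOLSET_PREFIX"/*) return 0 ;;
    *) return 1 ;;
  esac
}

as_path=$(command -v as || true)
if [ -z "$as_path" ]; then
  echo "as: not found in PATH"
elif in_toolset "$as_path"; then
  echo "as comes from gcc-toolset-11: $as_path"
else
  echo "as is NOT from gcc-toolset-11: $as_path"
fi

# Print the version banner when an assembler is available.
if [ -n "$as_path" ]; then
  as --version | head -n 1
fi
```

Running this in the same shell that later runs ninja shows whether the el8 system assembler or the toolset one would be picked up.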
 
I created a new docker instance and repeated the same build steps, while limiting the ninja parallelism to 8, and the build was successful. As for the default parallel jobs, I can't find how to print out the default parallelism of the ninja command, but per the source code of ninja, the command should have 10 parallel jobs as my container used 8 cores.  Below are the steps I used: 

-j10 with 32GB available memory should not cause OOM, i think. but if you run into this again, probably you could check dmesg and use "top" to check the memory consumption stats of the compiling processes, so we can adjust the heuristic settings in cmake/modules/LimitJobs.cmake. 
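
A rough sketch of that check. The grep patterns are assumptions about how the kernel usually words OOM events, and the helper name oom_lines is made up for illustration. Inside a container dmesg may not be readable, in which case it would have to be run on the host.

```shell
# Keep only kernel log lines that typically accompany an OOM kill.
oom_lines() {
  grep -iE 'out of memory|oom-killer|killed process'
}

# dmesg may be restricted inside a container; run it on the host if so.
dmesg 2>/dev/null | oom_lines ||
  echo "no OOM lines found (or dmesg is not readable here)"

# Header plus the ten largest processes by resident memory (RSS, in KiB).
ps -eo pid,rss,comm --sort=-rss 2>/dev/null | head -n 11
```

Running the ps line repeatedly while the build is in progress gives a rough idea of how much memory each cc1plus process takes.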
 
1. docker run  --name ceph-dev -it quay.io/centos/centos:stream8 /bin/bash
2. yum install git
3. git clone https://github.com/ceph/ceph 
4. cd ceph
5. ./install-deps.sh
6. scl enable gcc-toolset-11 bash
7. gcc -v # verify that we are on gcc11
8. ./do_cmake.sh
9. cd build
10. ninja -j 8 # Initially I used only ninja, which would fail due to insufficient memory. 
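
On the default parallelism: "ninja --help" usually prints the default next to the -j option. The sketch below mirrors the rule in ninja's source as I understand it (2 jobs for up to one core, 3 for two cores, otherwise cores + 2). The function name is made up for illustration, and the rule should be checked against the ninja version in use.

```shell
# Mirror of ninja's default job count for a given number of cores.
# Assumption: the rule is 0 or 1 core -> 2, 2 cores -> 3, otherwise cores + 2.
ninja_default_jobs() {
  n=$1
  case "$n" in
    0|1) echo 2 ;;
    2) echo 3 ;;
    *) echo $((n + 2)) ;;
  esac
}

# nproc reports the cores visible to this shell; fall back to 1 if missing.
cores=$(nproc 2>/dev/null || echo 1)
echo "cores=$cores, so plain 'ninja' would default to -j$(ninja_default_jobs "$cores")"
```

With 8 cores this gives 10, which matches the number quoted in this thread.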
 
this is funny. it seems that the gcc driver was feeding the assembler with --gdwarf-4, and GNU as 2.30 does not understand this option, but i believe GNU as 2.36 does. GNU as 2.30 is shipped with el8, while GNU as 2.36 is packaged as part of gcc-toolset-11. i am not sure how you set up the build environment, but i'd suggest following the instructions and using something like
source scl_source enable gcc-toolset-11
or
scl enable gcc-toolset-11 bash

I did run `scl enable gcc-toolset-11 bash` after running `./install-deps.sh`. Here is the output of gcc -v:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/rh/gcc-toolset-11/root/usr/libexec/gcc/x86_64-redhat-linux/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/gcc-toolset-11/root/usr --mandir=/opt/rh/gcc-toolset-11/root/usr/share/man --infodir=/opt/rh/gcc-toolset-11/root/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-11.2.1-20220127/obj-x86_64-redhat-linux/isl-install --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.2.1 20220127 (Red Hat 11.2.1-9) (GCC)
 
next time, if you have further questions, please share your command line and the steps to reproduce the build failure. I still don't know how many jobs you were using before you reduced it to 8.

Will do. The initial command was just "ninja". I can't find how to print out the default parallelism of the ninja command, but per the source code of ninja, the command should have 10 parallel jobs as my container used 8 cores. As for the steps, I simply followed Ceph's README:
1. docker run  --name ceph-dev -it quay.io/centos/centos:stream8 /bin/bash
2. yum install git
3. git clone https://github.com/ceph/ceph 
4. cd ceph
5. ./install-deps.sh
6. scl enable gcc-toolset-11 bash
7. gcc -v # verify that we are on gcc11
8. ./do_cmake.sh
9. cd build
10. ninja 

thank you, Yong. IMHO, the workflow is flawless =)

i just hoped that we could learn something from the failures and improve our process or the build script, but nothing looks suspicious so far.
 

On Tue, Dec 5, 2023 at 7:32 AM kefu chai <tchaikov@xxxxxxxxx> wrote:


On Tue, Dec 5, 2023 at 4:26 AM Yong Yuan <yycslab@xxxxxxxxx> wrote:
I added the "-v" flag to the setup.py that calls the compiler. The compiler did search the path /ceph/src/include for the header and the verbose output does not include any error message on missing header files. The compiler exception did not include more details other than "command '/opt/rh/gcc-toolset-11/root/usr/bin/gcc' failed with exit status 1". I also manually ran the compiler command per the log successfully. There was no compilation error or broken pipe. Anything else I can do to figure out what went wrong? Thanks, 

Manually rerunning the failed step: 
/opt/rh/gcc-toolset-11/root/usr/libexec/gcc/x86_64-redhat-linux/11/cc1 -quiet -v -D DYNAMIC_ANNOTATIONS_ENABLED=1 -D NDEBUG -D _GNU_SOURCE -D void0="dead_function(void)" -D "__Pyx_check_single_interpreter(ARG)"="ARG##0" -D _FILE_OFFSET_BITS=64 -iquote /ceph/src/include -iquote /ceph/src/pybind/cephfs/../../include -D_FORTIFY_SOURCE=2 -D_GLIBCXX_ASSERTIONS /ceph/src/pybind/cephfs/tmpmpszwqum/cephfs_dummy.c -quiet -dumpdir /ceph/src/pybind/cephfs/tmpmpszwqum/ceph/src/pybind/cephfs/tmpmpszwqum/ -dumpbase cephfs_dummy.c -dumpbase-ext .c -m64 -mtune=generic -march=x86-64 -g -grecord-gcc-switches -O2 -Wall -Werror=format-security -w -version -fexceptions -fstack-protector-strong -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection=full -fwrapv -fPIC -o - | as -v -W --gdwarf-4 --64 -o /ceph/src/pybind/cephfs/tmpmpszwqum/ceph/src/pybind/cephfs/tmpmpszwqum/cephfs_dummy.o
GNU assembler version 2.30 (x86_64-redhat-linux) using BFD version version 2.30-123.el8
as: unrecognized option '--gdwarf-4'

this is funny. it seems that the gcc driver was feeding the assembler with --gdwarf-4, and GNU as 2.30 does not understand this option, but i believe GNU as 2.36 does. GNU as 2.30 is shipped with el8, while GNU as 2.36 is packaged as part of gcc-toolset-11. i am not sure how you set up the build environment, but i'd suggest following the instructions and using something like

source scl_source enable gcc-toolset-11

or

scl enable gcc-toolset-11 bash

next time, if you have further questions, please share your command line and the steps to reproduce the build failure. I still don't know how many jobs you were using before you reduced it to 8.

 
GNU C17 (GCC) version 11.2.1 20220127 (Red Hat 11.2.1-9) (x86_64-redhat-linux)
compiled by GNU C version 11.2.1 20220127 (Red Hat 11.2.1-9), GMP version 6.1.2, MPFR version 3.1.6-p2, MPC version 1.1.0, isl version isl-0.18-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/opt/rh/gcc-toolset-11/root/usr/lib/gcc/x86_64-redhat-linux/11/include-fixed"
ignoring nonexistent directory "/opt/rh/gcc-toolset-11/root/usr/lib/gcc/x86_64-redhat-linux/11/../../../../x86_64-redhat-linux/include"
ignoring duplicate directory "/ceph/src/pybind/cephfs/../../include"
#include "..." search starts here:
 /ceph/src/include
#include <...> search starts here:
 /opt/rh/gcc-toolset-11/root/usr/lib/gcc/x86_64-redhat-linux/11/include
 /usr/local/include
 /opt/rh/gcc-toolset-11/root/usr/include
 /usr/include
End of search list.
GNU C17 (GCC) version 11.2.1 20220127 (Red Hat 11.2.1-9) (x86_64-redhat-linux)
compiled by GNU C version 11.2.1 20220127 (Red Hat 11.2.1-9), GMP version 6.1.2, MPFR version 3.1.6-p2, MPC version 1.1.0, isl version isl-0.18-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: a3c58fc8debcd373de63dc0d26f2c032


Running ninja with verbose output: 

[root@bf1a404c20a3 build]# ninja -v -j 1 | tee verbose_output.log
[1/560] cd /ceph/src/pybind/cephfs && env CC="/opt/rh/gcc-toolset-11/root/usr/bin/gcc" CFLAGS="" CPPFLAGS="-iquote/ceph/src/include -w -D'void0=dead_function(void)' -D'__Pyx_check_single_interpreter(ARG)=ARG##0'" CXX="/opt/rh/gcc-toolset-11/root/usr/bin/g++" LDSHARED="/opt/rh/gcc-toolset-11/root/usr/bin/gcc -shared" OPT="-DNDEBUG -g -fwrapv -O2 -w" LDFLAGS=-L/ceph/build/lib CYTHON_BUILD_DIR=/ceph/build/src/pybind/cephfs CEPH_LIBDIR=/ceph/build/lib /usr/bin/python3.6 /ceph/src/pybind/cephfs/setup.py build --build-base /ceph/build/lib/cython_modules --build-platlib /ceph/build/lib/cython_modules/lib.3
FAILED: lib/cython_modules/lib.3/cephfs.cpython-36m-x86_64-linux-gnu.so
cd /ceph/src/pybind/cephfs && env CC="/opt/rh/gcc-toolset-11/root/usr/bin/gcc" CFLAGS="" CPPFLAGS="-iquote/ceph/src/include -w -D'void0=dead_function(void)' -D'__Pyx_check_single_interpreter(ARG)=ARG##0'" CXX="/opt/rh/gcc-toolset-11/root/usr/bin/g++" LDSHARED="/opt/rh/gcc-toolset-11/root/usr/bin/gcc -shared" OPT="-DNDEBUG -g -fwrapv -O2 -w" LDFLAGS=-L/ceph/build/lib CYTHON_BUILD_DIR=/ceph/build/src/pybind/cephfs CEPH_LIBDIR=/ceph/build/lib /usr/bin/python3.6 /ceph/src/pybind/cephfs/setup.py build --build-base /ceph/build/lib/cython_modules --build-platlib /ceph/build/lib/cython_modules/lib.3
Using built-in specs.
COLLECT_GCC=/opt/rh/gcc-toolset-11/root/usr/bin/gcc
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/gcc-toolset-11/root/usr --mandir=/opt/rh/gcc-toolset-11/root/usr/share/man --infodir=/opt/rh/gcc-toolset-11/root/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-11.2.1-20220127/obj-x86_64-redhat-linux/isl-install --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.2.1 20220127 (Red Hat 11.2.1-9) (GCC)
COLLECT_GCC_OPTIONS='-D' 'DYNAMIC_ANNOTATIONS_ENABLED=1' '-D' 'NDEBUG' '-O2' '-g' '-pipe' '-Wall' '-Werror=format-security' '-fexceptions' '-fstack-protector-strong' '-grecord-gcc-switches' '-m64' '-mtune=generic' '-fasynchronous-unwind-tables' '-fstack-clash-protection' '-fcf-protection=full' '-D' '_GNU_SOURCE' '-fwrapv' '-iquote' '/ceph/src/include' '-w' '-D' 'void0=dead_function(void)' '-D' '__Pyx_check_single_interpreter(ARG)=ARG##0' '-fPIC' '-v' '-iquote' '/ceph/src/pybind/cephfs/../../include' '-D' '_FILE_OFFSET_BITS=64' '-c' '-o' '/ceph/src/pybind/cephfs/tmp8l9cti0z/ceph/src/pybind/cephfs/tmp8l9cti0z/cephfs_dummy.o' '-march=x86-64' '-dumpdir' '/ceph/src/pybind/cephfs/tmp8l9cti0z/ceph/src/pybind/cephfs/tmp8l9cti0z/'
 /opt/rh/gcc-toolset-11/root/usr/libexec/gcc/x86_64-redhat-linux/11/cc1 -quiet -v -D DYNAMIC_ANNOTATIONS_ENABLED=1 -D NDEBUG -D _GNU_SOURCE -D void0=dead_function(void) -D __Pyx_check_single_interpreter(ARG)=ARG##0 -D _FILE_OFFSET_BITS=64 -iquote /ceph/src/include -iquote /ceph/src/pybind/cephfs/../../include -D_FORTIFY_SOURCE=2 -D_GLIBCXX_ASSERTIONS /ceph/src/pybind/cephfs/tmp8l9cti0z/cephfs_dummy.c -quiet -dumpdir /ceph/src/pybind/cephfs/tmp8l9cti0z/ceph/src/pybind/cephfs/tmp8l9cti0z/ -dumpbase cephfs_dummy.c -dumpbase-ext .c -m64 -mtune=generic -march=x86-64 -g -grecord-gcc-switches -O2 -Wall -Werror=format-security -w -version -fexceptions -fstack-protector-strong -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection=full -fwrapv -fPIC -o - |
 as -v -W --gdwarf-4 --64 -o /ceph/src/pybind/cephfs/tmp8l9cti0z/ceph/src/pybind/cephfs/tmp8l9cti0z/cephfs_dummy.o
GNU assembler version 2.30 (x86_64-redhat-linux) using BFD version version 2.30-123.el8
GNU C17 (GCC) version 11.2.1 20220127 (Red Hat 11.2.1-9) (x86_64-redhat-linux)
compiled by GNU C version 11.2.1 20220127 (Red Hat 11.2.1-9), as: unrecognized option '--gdwarf-4'
GMP version 6.1.2, MPFR version 3.1.6-p2, MPC version 1.1.0, isl version isl-0.18-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/opt/rh/gcc-toolset-11/root/usr/lib/gcc/x86_64-redhat-linux/11/include-fixed"
ignoring nonexistent directory "/opt/rh/gcc-toolset-11/root/usr/lib/gcc/x86_64-redhat-linux/11/../../../../x86_64-redhat-linux/include"
ignoring duplicate directory "/ceph/src/pybind/cephfs/../../include"
#include "..." search starts here:
 /ceph/src/include
#include <...> search starts here:
 /opt/rh/gcc-toolset-11/root/usr/lib/gcc/x86_64-redhat-linux/11/include
 /usr/local/include
 /opt/rh/gcc-toolset-11/root/usr/include
 /usr/include
End of search list.
GNU C17 (GCC) version 11.2.1 20220127 (Red Hat 11.2.1-9) (x86_64-redhat-linux)
compiled by GNU C version 11.2.1 20220127 (Red Hat 11.2.1-9), GMP version 6.1.2, MPFR version 3.1.6-p2, MPC version 1.1.0, isl version isl-0.18-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: a3c58fc8debcd373de63dc0d26f2c032
/ceph/src/pybind/cephfs/tmp8l9cti0z/cephfs_dummy.c:9:1: fatal error: error writing to -: Broken pipe
    9 | }
      | ^
compilation terminated.

Compile Error: Ceph FS development headers not found
the tmp_file:  /ceph/src/pybind/cephfs/tmp8l9cti0z/cephfs_dummy.c Ceph source:  /ceph/src/pybind/cephfs/../..
The exception:  command '/opt/rh/gcc-toolset-11/root/usr/bin/gcc' failed with exit status 1
ninja: build stopped: subcommand failed.

On Mon, Dec 4, 2023 at 12:33 AM Yong Yuan <yycslab@xxxxxxxxx> wrote:
Thanks, Kefu! I had bumped the memory of my container to 32GB in anticipation of the need for more memory, but I guess it was not enough. My machine has 8 cores, 64GB of physical memory, and had more than 36GB of available memory when I kicked off the build. I reduced the number of parallel jobs to 8 and reran the ninja job. This time the job failed with the errors below. It looks like the header file cephfs/libcephfs.h was missing? The file ./src/include/cephfs/libcephfs.h does exist, though.

[root@bf1a404c20a3 build]# ninja -v -j 8
[1/567] cd /ceph/src/pybind/cephfs && env CC="/opt/rh/gcc-toolset-11/root/usr/bin/gcc" CFLAGS="" CPPFLAGS="-iquote/ceph/src/include -w -D'void0=dead_function(void)' -D'__Pyx_check_single_interpreter(ARG)=ARG##0'" CXX="/opt/rh/gcc-toolset-11/root/usr/bin/g++" LDSHARED="/opt/rh/gcc-toolset-11/root/usr/bin/gcc -shared" OPT="-DNDEBUG -g -fwrapv -O2 -w" LDFLAGS=-L/ceph/build/lib CYTHON_BUILD_DIR=/ceph/build/src/pybind/cephfs CEPH_LIBDIR=/ceph/build/lib /usr/bin/python3.6 /ceph/src/pybind/cephfs/setup.py build --build-base /ceph/build/lib/cython_modules --build-platlib /ceph/build/lib/cython_modules/lib.3
FAILED: lib/cython_modules/lib.3/cephfs.cpython-36m-x86_64-linux-gnu.so
cd /ceph/src/pybind/cephfs && env CC="/opt/rh/gcc-toolset-11/root/usr/bin/gcc" CFLAGS="" CPPFLAGS="-iquote/ceph/src/include -w -D'void0=dead_function(void)' -D'__Pyx_check_single_interpreter(ARG)=ARG##0'" CXX="/opt/rh/gcc-toolset-11/root/usr/bin/g++" LDSHARED="/opt/rh/gcc-toolset-11/root/usr/bin/gcc -shared" OPT="-DNDEBUG -g -fwrapv -O2 -w" LDFLAGS=-L/ceph/build/lib CYTHON_BUILD_DIR=/ceph/build/src/pybind/cephfs CEPH_LIBDIR=/ceph/build/lib /usr/bin/python3.6 /ceph/src/pybind/cephfs/setup.py build --build-base /ceph/build/lib/cython_modules --build-platlib /ceph/build/lib/cython_modules/lib.3
as: unrecognized option '--gdwarf-4'
/ceph/src/pybind/cephfs/tmpog18e3jq/cephfs_dummy.c:9:1: fatal error: error writing to -: Broken pipe
    9 | }
      | ^
compilation terminated.

Compile Error: Ceph FS development headers not found
ninja: build stopped: subcommand failed.


On Sun, Dec 3, 2023 at 8:00 PM kefu chai <tchaikov@xxxxxxxxx> wrote:



On Mon, Dec 4, 2023 at 9:29 AM Yong Yuan <yycslab@xxxxxxxxx> wrote:
My bad. I'm still developing the right mental model for C++'s or Ceph's build toolchain. Please see attached for the entire output of the run with ninja -v. The file dot_ninja.log.gzip is a copy of the file .ninja_log in the same build directory. In case it helps, I used the container quay.io/centos/centos:stream8, and below is the OS configuration of the container instance. Thanks,

i am quoting the command line which failed:

[755/2207] /usr/bin/ccache /opt/rh/gcc-toolset-11/root/usr/bin/g++ -DBOOST_ASIO_DISABLE_THREAD_KEYWORD_EXTENSION -DBOOST_ASIO_HAS_IO_URING -DBOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT -DHAVE_CONFIG_H -DTEST_LIBRBD_INTERNALS -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -D_REENTRANT -D_THREAD_SAFE -D__CEPH__ -D__STDC_FORMAT_MACROS -D__linux__ -Isrc/include -I../src -isystem boost/include -isystem include -isystem ../src/xxHash -isystem ../src/fmt/include -isystem ../src/googletest/googlemock/include -isystem ../src/googletest/googlemock -isystem ../src/googletest/googletest/include -isystem ../src/googletest/googletest -isystem ../src/jaegertracing/opentelemetry-cpp/api/include -isystem ../src/jaegertracing/opentelemetry-cpp/exporters/jaeger/include -isystem ../src/jaegertracing/opentelemetry-cpp/ext/include -isystem ../src/jaegertracing/opentelemetry-cpp/sdk/include -Og -g -fPIE -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free -DBOOST_PHOENIX_STL_TUPLE_H_ -Wall -fno-strict-aliasing -fsigned-char -Wtype-limits -Wignored-qualifiers -Wpointer-arith -Werror=format-security -Winit-self -Wno-unknown-pragmas -Wnon-virtual-dtor -Wno-ignored-qualifiers -ftemplate-depth-1024 -Wpessimizing-move -Wredundant-move -Wstrict-null-sentinel -Woverloaded-virtual -DCEPH_DEBUG_MUTEX -fstack-protector-strong -D_GLIBCXX_ASSERTIONS -fdiagnostics-color=auto -std=c++2a -MD -MT src/test/librbd/CMakeFiles/unittest_librbd.dir/migration/test_mock_HttpClient.cc.o -MF src/test/librbd/CMakeFiles/unittest_librbd.dir/migration/test_mock_HttpClient.cc.o.d -o src/test/librbd/CMakeFiles/unittest_librbd.dir/migration/test_mock_HttpClient.cc.o -c ../src/test/librbd/migration/test_mock_HttpClient.cc
FAILED: src/test/librbd/CMakeFiles/unittest_librbd.dir/migration/test_mock_HttpClient.cc.o
/usr/bin/ccache /opt/rh/gcc-toolset-11/root/usr/bin/g++ -DBOOST_ASIO_DISABLE_THREAD_KEYWORD_EXTENSION -DBOOST_ASIO_HAS_IO_URING -DBOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT -DHAVE_CONFIG_H -DTEST_LIBRBD_INTERNALS -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -D_REENTRANT -D_THREAD_SAFE -D__CEPH__ -D__STDC_FORMAT_MACROS -D__linux__ -Isrc/include -I../src -isystem boost/include -isystem include -isystem ../src/xxHash -isystem ../src/fmt/include -isystem ../src/googletest/googlemock/include -isystem ../src/googletest/googlemock -isystem ../src/googletest/googletest/include -isystem ../src/googletest/googletest -isystem ../src/jaegertracing/opentelemetry-cpp/api/include -isystem ../src/jaegertracing/opentelemetry-cpp/exporters/jaeger/include -isystem ../src/jaegertracing/opentelemetry-cpp/ext/include -isystem ../src/jaegertracing/opentelemetry-cpp/sdk/include -Og -g -fPIE -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free -DBOOST_PHOENIX_STL_TUPLE_H_ -Wall -fno-strict-aliasing -fsigned-char -Wtype-limits -Wignored-qualifiers -Wpointer-arith -Werror=format-security -Winit-self -Wno-unknown-pragmas -Wnon-virtual-dtor -Wno-ignored-qualifiers -ftemplate-depth-1024 -Wpessimizing-move -Wredundant-move -Wstrict-null-sentinel -Woverloaded-virtual -DCEPH_DEBUG_MUTEX -fstack-protector-strong -D_GLIBCXX_ASSERTIONS -fdiagnostics-color=auto -std=c++2a -MD -MT src/test/librbd/CMakeFiles/unittest_librbd.dir/migration/test_mock_HttpClient.cc.o -MF src/test/librbd/CMakeFiles/unittest_librbd.dir/migration/test_mock_HttpClient.cc.o.d -o src/test/librbd/CMakeFiles/unittest_librbd.dir/migration/test_mock_HttpClient.cc.o -c ../src/test/librbd/migration/test_mock_HttpClient.cc
g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.

this error is sitting right at the 17th line from the end of the file.

please check your build environment. my hunch is that the OOM killer kicked in and killed the compiler process with SIGKILL. if that's the case, you could probably lower the number of jobs by passing "-j <num-of-jobs>" to ninja or to cmake. you could also help by providing more details on your environment, like the number of cores, the total physical memory, and the available memory when compiling ceph. we try to figure out a reasonable level of parallelism in cmake/modules/LimitJobs.cmake based on previous experiments, but it might not work well under some settings.
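
A back-of-the-envelope way to pick a -j value from the available memory. This is only a sketch: the 4096 MiB per-job budget is a hypothetical number, not the value used by cmake/modules/LimitJobs.cmake, and the function name is made up. Also note that /proc/meminfo inside a container may describe the host rather than the container's memory limit.

```shell
# jobs_for_memory AVAILABLE_MIB PER_JOB_MIB CORES
# Prints min(CORES, AVAILABLE_MIB / PER_JOB_MIB), but never less than 1.
jobs_for_memory() {
  n=$(( $1 / $2 ))
  if [ "$n" -gt "$3" ]; then n=$3; fi
  if [ "$n" -lt 1 ]; then n=1; fi
  echo "$n"
}

PER_JOB_MIB=4096   # hypothetical budget per compile job, not a Ceph value
cores=$(nproc 2>/dev/null || echo 1)

if [ -r /proc/meminfo ]; then
  # MemAvailable is reported in KiB; convert it to MiB.
  avail_mib=$(awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo)
  avail_mib=${avail_mib:-0}
  echo "suggested: ninja -j$(jobs_for_memory "$avail_mib" "$PER_JOB_MIB" "$cores")"
fi
```

For example, with 32768 MiB available, a 4096 MiB budget and 8 cores, this suggests -j8.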


[root@bf1a404c20a3 build]# cat /etc/os-release
NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"


On Sun, Dec 3, 2023 at 12:50 AM kefu chai <tchaikov@xxxxxxxxx> wrote:


On Sun, Dec 3, 2023 at 4:27 PM Yong Yuan <yycslab@xxxxxxxxx> wrote:
Thanks, Kefu! When I ran ninja -v, I didn't see any error message other than "ninja: build stopped: subcommand failed." The step and the output right before this message are below. The file .ninja_log does not contain any error message, either. I also reran the step below and it simply completed successfully. Any suggestion on what I can do to gather more detailed error messages?

The snippet does not help. Again, please collect the *full* output of the run which failed. If the output is too large, you could use a pastebin service or upload it somewhere it can be downloaded.
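
One possible way to capture the full output, as a sketch. The helper names run_logged and failed_steps and the log file name are made up for illustration. Redirecting both stdout and stderr matters, because compiler errors go to stderr and a plain "ninja -v | tee file" may miss them.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND with stdout and stderr both written to LOGFILE and
# returns COMMAND's own exit status.
run_logged() {
  log=$1
  shift
  "$@" >"$log" 2>&1
}

# Print every "FAILED:" line of a ninja log, prefixed with its line number.
failed_steps() {
  grep -n '^FAILED:' "$1"
}

# Example (commented out so the snippet is safe to paste anywhere):
#   run_logged full_build.log ninja -v
#   echo "ninja exit status: $?"
#   failed_steps full_build.log
```

The log can be followed from another terminal with "tail -f" while the build runs, and compressed before it is attached or uploaded.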


[767/2207] /usr/bin/ccache /opt/rh/gcc-toolset-11/root/usr/bin/g++ -DBOOST_ASIO_DISABLE_THREAD_KEYWORD_EXTENSION -DBOOST_ASIO_HAS_IO_URING -DBOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT -DHAVE_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -D_REENTRANT -D_THREAD_SAFE -D__CEPH__ -D__STDC_FORMAT_MACROS -D__linux__ -Isrc/include -I../src -isystem boost/include -isystem include -isystem ../src/xxHash -isystem ../src/fmt/include -isystem ../src/jaegertracing/opentelemetry-cpp/api/include -isystem ../src/jaegertracing/opentelemetry-cpp/exporters/jaeger/include -isystem ../src/jaegertracing/opentelemetry-cpp/ext/include -isystem ../src/jaegertracing/opentelemetry-cpp/sdk/include -Og -g -fPIC -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free -DBOOST_PHOENIX_STL_TUPLE_H_ -Wall -fno-strict-aliasing -fsigned-char -Wtype-limits -Wignored-qualifiers -Wpointer-arith -Werror=format-security -Winit-self -Wno-unknown-pragmas -Wnon-virtual-dtor -Wno-ignored-qualifiers -ftemplate-depth-1024 -Wpessimizing-move -Wredundant-move -Wstrict-null-sentinel -Woverloaded-virtual -DCEPH_DEBUG_MUTEX -fstack-protector-strong -D_GLIBCXX_ASSERTIONS -fdiagnostics-color=auto -std=c++2a -MD -MT src/mds/CMakeFiles/mds.dir/CInode.cc.o -MF src/mds/CMakeFiles/mds.dir/CInode.cc.o.d -o src/mds/CMakeFiles/mds.dir/CInode.cc.o -c ../src/mds/CInode.cc



On Sat, Dec 2, 2023 at 5:37 PM kefu chai <tchaikov@xxxxxxxxx> wrote:


On Sun, Dec 3, 2023 at 5:00 AM Yong Yuan <yycslab@xxxxxxxxx> wrote:
Hi, 

I'm trying to build a DEBUG version of Ceph's reef-release branch on an Ubuntu 22.04 LTS virtual machine running on Lima by following the README on Ceph's GitHub repo. The build failed and the last CMake error was "g++-11: error: unrecognized command-line option '-Wimplicit-const-int-float-conversion'". Could anyone please help me figure out what I can do to track down and fix the compilation error? Below are more details.


The system configuration is as follows: 

> uname -a
Linux lima-ceph-dev 5.15.0-86-generic #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux


> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy


I followed the instructions on the README in Ceph's github repo: https://github.com/ceph/ceph, and the command ./do_cmake.sh failed at step [137/2150] that builds frontend dashboard with the error message "ninja: build stopped: subcommand failed." The last error logged in the file CMakeError.log has to do with "g++-11: error: unrecognized command-line option '-Wimplicit-const-int-float-conversion'". 

Below is the last error message in CMakeError.log:

Performing C++ SOURCE FILE Test COMPILER_SUPPORTS_WARN_IMPLICIT_CONST_INT_FLOAT_CONVERSION failed with the following output:
Change Dir: /home/dyuan.linux/ceph/build/CMakeFiles/CMakeTmp

Run Build Command(s):/usr/bin/ninja cmTC_bab6d && [1/2] Building CXX object CMakeFiles/cmTC_bab6d.dir/src.cxx.o
FAILED: CMakeFiles/cmTC_bab6d.dir/src.cxx.o
/usr/bin/g++-11 -DCOMPILER_SUPPORTS_WARN_IMPLICIT_CONST_INT_FLOAT_CONVERSION  -fPIE   -Wimplicit-const-int-float-conversion -std=c++20 -o CMakeFiles/cmTC_bab6d.dir/src.cxx.o -c /home/dyuan.linux/ceph/build/CMakeFiles/CMakeTmp/src.cxx
g++-11: error: unrecognized command-line option '-Wimplicit-const-int-float-conversion'
ninja: build stopped: subcommand failed.


I think the error from CMake is not fatal. What it implies is that the compiler does not support this warning option. see the change which introduced this check: https://github.com/ceph/ceph/commit/658ecaec8ebade868799f2535bd195853f07f7f9 .  
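
What such a probe boils down to can be reproduced by hand. The sketch below is not the CMake check itself, only an approximation of it: it feeds a trivial program to the compiler with the flag added and looks at the exit status. The function name and the list of compilers to try are assumptions.

```shell
# compiler_accepts_flag COMPILER FLAG
# Succeeds if COMPILER can syntax-check a trivial C++ program with FLAG added.
compiler_accepts_flag() {
  "$1" -x c++ "$2" -fsyntax-only - >/dev/null 2>&1 <<'EOF'
int main() { return 0; }
EOF
}

FLAG=-Wimplicit-const-int-float-conversion
# Probe whichever of these compilers happen to be installed.
for cxx in g++ g++-11 clang++; do
  if command -v "$cxx" >/dev/null 2>&1; then
    if compiler_accepts_flag "$cxx" "$FLAG"; then
      echo "$cxx accepts $FLAG"
    else
      echo "$cxx rejects $FLAG"
    fi
  fi
done
```

A "rejects" result for g++-11 here would match the CMakeError.log entry above and by itself says nothing about why the real build stopped.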

if the build fails, the root cause could be in the cmake script which generates the building system. but the error message does not prove this hypothesis. please paste the full error from ninja, not the one in the CMakeError.log. 


I also tried to build Ceph in a Docker container based on quay.io/centos/centos and got the same g++ failure.
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx


--
Regards
Kefu Chai

