Hi,
Ruby 3.3 was released recently, and we have noticed that it fails to
build on COPR's aarch64 builders:
https://download.copr.fedorainfracloud.org/results/jackorp/ruby-builds/fedora-rawhide-aarch64/06848355-ruby/
https://download.copr.fedorainfracloud.org/results/jackorp/ruby-builds/fedora-rawhide-aarch64/06848355-ruby/builder-live.log.gz
However, we do not observe these failures on koji (see e.g.
https://koji.fedoraproject.org/koji/taskinfo?taskID=111230891).
What I have observed is that hw_info.log reports different CPU flags:
at a glance, koji lists roughly half as many flags as copr, even though
both report the same CPU model (Neoverse-N1) and vendor ID (ARM).
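
To make the flag comparison less visual, the two flag lists can be
diffed as sets. Below is a minimal Python sketch, assuming each
hw_info.log contains an lscpu-style `Flags:` line (or a
/proc/cpuinfo-style `Features` line). The function names and the two
sample lines are hypothetical and shortened, not copied from the real
logs:

```python
def parse_flags(text):
    """Return the set of CPU flags from the first 'Flags'/'Features' line."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            return set(value.split())
    return set()


def compare_flags(first_text, second_text):
    """Return (flags only in first, flags only in second), both sorted."""
    first, second = parse_flags(first_text), parse_flags(second_text)
    return sorted(first - second), sorted(second - first)


# Hypothetical, shortened sample lines for illustration only.
copr_sample = "Flags: fp asimd aes paca pacg"
koji_sample = "Flags: fp asimd aes"

only_copr, only_koji = compare_flags(copr_sample, koji_sample)
print("only in first log:", only_copr)
print("only in second log:", only_koji)
```

In practice the two sample strings would be replaced with the contents
of the hw_info.log files downloaded from the copr and koji builds.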
More details regarding the failures:
According to the upstream bug report [0], the culprit is the change that
introduced PAC/BTI support in some arm64 assembly [1]. The fix that
stops Ruby from segfaulting is to pass
`ASFLAGS=-mbranch-protection=pac-ret` [2] in addition to the same flag
in XCFLAGS.
This raises a few questions for me:
1. Since [1], `-mbranch-protection=pac-ret` is needed in both CFLAGS
and ASFLAGS, and I am unsure how it interacts with the Fedora defaults.
I see that the default CFLAGS contain `-mbranch-protection=standard`,
and with the upstream fix [2] the pac-ret flag seems to be appended
when building libruby.so.
From what I understand, having these 2 flags at the same time on the
appropriate compilation artifacts shouldn't cause problems. Is that
correct?
2. Since files compiled with `-mbranch-protection=pac-ret` seem to end
up in the .so library and Ruby binary extensions link against that solib,
do the binary extensions also have to be compiled with that exact option?
3. If we do not fix this bug in Ruby 3.3.0 but wait for 3.3.1, where
the fix will most probably land, will we in effect exclude the subset
of ARM CPUs that actually have the PAC capability for that in-between
period?
4. Why do the koji and copr CPU flag sets differ so much? Is our
koji infra OK?
5. Why does it fail on copr but not on koji? It seems that paca/pacg
have to be present in the CPU flags for the segfaults to occur.
I tried to answer the last question by reading the kernel docs [3],
but I can't say I understand the text 100%.
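
As a way to check question 5 on a given machine, here is a rough Python
sketch that reports whether the kernel advertises the paca/pacg flags
to userspace. It assumes the arm64 /proc/cpuinfo layout with a
`Features` line; `pac_flags` is a hypothetical helper name. It only
shows what the builder exposes and does not by itself explain the
segfault:

```python
from pathlib import Path


def pac_flags(cpuinfo_text):
    """Return which of paca/pacg appear on the first 'Features' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return {"paca", "pacg"} & set(value.split())
    return set()


# On non-arm64 or non-Linux hosts this simply prints an empty list.
cpuinfo = Path("/proc/cpuinfo")
text = cpuinfo.read_text() if cpuinfo.exists() else ""
print("PAC flags advertised:", sorted(pac_flags(text)))
```

Running this inside both a copr and a koji build root would show
whether only one of them exposes paca/pacg.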
Thanks,
Jarek Prokop
[0] https://bugs.ruby-lang.org/issues/20085
[1] https://github.com/ruby/ruby/pull/9306
[2] https://github.com/ruby/ruby/pull/9371
[3] https://www.kernel.org/doc/html/v6.4/arm64/pointer-authentication.html