On Mon, Apr 01, 2024 at 09:06:16AM +0900, Dominique Martinet wrote:
> Scott Schmit wrote on Sun, Mar 31, 2024 at 05:02:44PM -0400:
> > Deleting the tests makes no sense to me either, but it seems like a
> > mechanism that ensures the test code can't change the build outputs (or
> > a mechanism to detect that it's happened and abort the build) would
> > allow upstream tests to be run without compromising the integrity of the
> > build itself.
>
> Just to be clear here that wouldn't have been enough: it's not the test
> step that's modifying the binaries, the actual build step is modified in
> the right conditions to use data that looks like it belongs to a test
> (I've read the actual files aren't actually used in any test and just
> look like test data, I didn't check, it wouldn't be hard to make a test
> that uses them anyway)
>
> So short of deleting all blobs e.g. all test data this wouldn't have
> been prevented, just not running tests isn't enough.

Ugh! I'd missed that detail.

> In theory it'd be possible to build twice:
> - one normal build with test data, and run tests at the end
> - a second build without test data (and confirm we've got an identical
>   binary, builds are reproducible right?!)

FWIW, I don't think you'd need to build twice:

One approach:
1. do the build
2. do the install
3. generate the RPMs
4. quarantine the RPMs so they're safe from modification
   - I believe this could be done via SELinux policy
   - there are probably other mechanisms
5. run the tests
   - for SELinux, this might be via an `rpmbuild-test` binary that doesn't
     have rights to touch the output RPMs
6a. if the tests fail, destroy the RPMs and fail out, reproducing the
    result today
6b. if the tests pass, move/copy the RPMs to the result location and exit
    cleanly, reproducing the result today

A variation of this would order things as normal but capture hashes of
all the files before the tests start and check the hashes after they're
done. As long as the hashes can't be changed in between, you can trust
that.
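Something like this is what I have in mind for the hash variation -- a
rough sketch only; the result directory path and the idea of bolting it
onto rpmbuild/mock are made up for illustration:

  # hypothetical: RESULTDIR is wherever the freshly built RPMs land
  RESULTDIR=/builddir/build/RPMS

  # after the RPMs exist, before the test suite starts
  ( cd "$RESULTDIR" && find . -type f -exec sha256sum {} + | sort ) \
      > /tmp/rpm-hashes.pre

  # ... run the test suite here ...

  ( cd "$RESULTDIR" && find . -type f -exec sha256sum {} + | sort ) \
      > /tmp/rpm-hashes.post

  # any difference means something touched the outputs during the tests
  if ! cmp -s /tmp/rpm-hashes.pre /tmp/rpm-hashes.post; then
      echo "ERROR: build outputs changed while the tests ran" >&2
      exit 1
  fi

Of course the .pre file (and whatever runs this check) needs the same
protection from the tests as the RPMs themselves, or we're back where we
started.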
Another approach (perhaps more doable in mock, not so easy otherwise):
1. do the build as user1
2. do the tests as user2
3. if the tests pass, resume as user1 as today

Given that the build pollution came from loading data out of the tests,
the above might still be useful, but it's not sufficient to prevent a
recurrence. (Though if you can identify the test files, you could still
segregate them via similar means -- make them unreadable until you're
ready to run the tests.)

> But while we might be able to afford the computing cost, I'm not sure
> it's worth it -- this attack vector happened to use test data, but there
> are plenty of other ways of doing this, and even just identifying /
> removing test data in the first place is hard work for packagers (I
> guess we could try to detect binary files but there is no end to that
> either, and many builds would just break if we were to automatically
> remove tests...)

Another idea I've seen floated
<https://chaos.social/@ollibaba/112189619904512612> is to modify the
build tools (compilers, etc) to generate SBOM files that log every input
file used to produce a given output. Then you have a record to audit to
see if stuff is pulled from weird places.

That said, that wouldn't help either, because all the machinations were
via shell commands that aren't, strictly speaking, build tools. (Except
if the linker shows a file as input that was never generated by a
compiler…)

But we could adapt that idea -- couldn't we use strace(1)/ptrace(2) to
detect open calls to determine the inputs and outputs? And we could
protect the resulting records from build/test tampering similarly to the
above.
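As a rough sketch of that (the spec name, log paths, and the "anything
under a tests/ directory" heuristic are made up for illustration; as
said above, the log itself would have to live somewhere the build and
tests can't touch):

  # record every file-related syscall made by the build and its children
  strace -f -e trace=%file -o /tmp/build-files.log rpmbuild -bb foo.spec

  # pull out the paths that were opened or stat'ed, then flag anything
  # that looks like test data being read during the build
  grep -o '"[^"]*"' /tmp/build-files.log | tr -d '"' | sort -u \
      > /tmp/build-inputs.txt
  if grep -E '(^|/)tests?/' /tmp/build-inputs.txt; then
      echo "WARNING: the build opened files under a test directory" >&2
  fi

It would be noisy (failed opens, plain stat() calls, etc.) and strace
slows builds down, so maybe this is something you'd only run for audits
rather than on every build.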
And if the upstream makes it difficult to tell what's test vs not… I
think at this point we have this incident as precedent to help explain
why we'd prefer less convoluted builds, or at least some explanations.

I dunno how workable that is in practice. Though a part of me says that
it all boils down to making the build process more transparent than
we've previously put up with.

Anyway, just putting that out there in case anyone thinks these are good
ideas (or brainstorming toward better ones).

> (Anyway, I also think tests bring more benefits than risks in our case)

And it also occurs to me that test failures could just as easily help
find a malicious change if some new feature we're adding to Fedora
breaks the tests unexpectedly and exposes it (or just something unique
about how we've configured things).

-- 
Scott Schmit