> On Tue, Jan 31, 2023 at 11:31:28AM +0000, Reshetova, Elena wrote:
> > > On Mon, 2023-01-30 at 07:42 +0000, Reshetova, Elena wrote:
> > > [...]
> > > > > The big threat from most devices (including the thunderbolt
> > > > > classes) is that they can DMA all over memory. However, this isn't
> > > > > really a threat in CC (well until PCI becomes able to do encrypted
> > > > > DMA) because the device has specific unencrypted buffers set aside
> > > > > for the expected DMA. If it writes outside that CC integrity will
> > > > > detect it and if it reads outside that it gets unintelligible
> > > > > ciphertext. So we're left with the device trying to trick secrets
> > > > > out of us by returning unexpected data.
> > > >
> > > > Yes, by supplying input that hasn’t been expected. This is
> > > > exactly the case we were trying to fix here for example:
> > > > https://lore.kernel.org/all/20230119170633.40944-2-alexander.shishkin@xxxxxxxxxxxxxxx/
> > > > I do agree that this case is less severe than others where memory
> > > > corruption/buffer overrun can happen, like here:
> > > > https://lore.kernel.org/all/20230119135721.83345-6-alexander.shishkin@xxxxxxxxxxxxxxx/
> > > > But we are trying to fix all issues we see now (prioritizing the
> > > > second ones though).
> > >
> > > I don't see how MSI table sizing is a bug in the category we've
> > > defined. The very text of the changelog says "resulting in a kernel
> > > page fault in pci_write_msg_msix()." which is a crash, which I thought
> > > we were agreeing was out of scope for CC attacks?
> >
> > As I said this is an example of a crash and at first look it
> > might not lead to an exploitable condition (albeit attackers are creative).
> > But we noticed this one while fuzzing and it was common enough
> > that it prevented the fuzzer from going deeper into virtio device driver fuzzing.
> > The core PCI/MSI doesn’t seem to have that many easily triggerable
> > Other examples in the virtio patchset are more severe.
> >
> > >
> > > > >
> > > > > If I set this as the problem, verifying device correct operation is
> > > > > a possible solution (albeit hugely expensive) but there are likely
> > > > > many other cheaper ways to defeat or detect a device trying to
> > > > > trick us into revealing something.
> > > >
> > > > What do you have in mind here for the actual devices we need to
> > > > enable for CC cases?
> > >
> > > Well, the most dangerous devices seem to be the virtio set a CC system
> > > will rely on to boot up. After that, there are other ways (like SPDM)
> > > to verify a real PCI device is on the other end of the transaction.
> >
> > Yes, in the future, but not yet. Other vendors will not necessarily be
> > using virtio devices at this point, so we will have non-virtio and not
> > CC enabled devices that we want to securely add to the guest.
> >
> > > > We have been using here a combination of extensive fuzzing and static
> > > > code analysis.
> > >
> > > by fuzzing, I assume you mean fuzzing from the PCI configuration space?
> > > Firstly I'm not so sure how useful a tool fuzzing is if we take Oopses
> > > off the table because fuzzing primarily triggers those
> >
> > If you enable memory sanitizers you can detect more severe conditions like
> > out of bounds accesses and such. I think given that we have a way to
> > verify that fuzzing is reaching the code locations we want it to reach, it
> > can be a pretty effective method to find at least low-hanging bugs. And these
> > will be the bugs that most attackers will go after in the first place.
> > But of course it is not a formal verification of any kind.
> >
> > > so its hard to
> > > see what else it could detect given the signal will be smothered by
> > > oopses and secondly I think the PCI interface is likely the wrong place
> > > to begin and you should probably begin on the virtio bus and the
> > > hypervisor generated configuration space.
> >
> > This is exactly what we do. We don’t fuzz from the PCI config space,
> > we supply inputs from the host/VMM via the legitimate interfaces through
> > which it can inject them into the guest: whenever the guest requests a PCI
> > config space read operation (which is controlled by host/hypervisor as you
> > said), it gets input injected by the kafl fuzzer. Same for other interfaces
> > that are under control of host/VMM (MSRs, port IO, MMIO, anything that goes
> > via #VE handler in our case). When it comes to virtio, we employ
> > two different fuzzing techniques: directly injecting kafl fuzz input when
> > virtio core or virtio drivers get the data received from the host
> > (via injecting input in functions virtio16/32/64_to_cpu and others) and
> > directly fuzzing DMA memory pages using kfx fuzzer.
> > More information can be found in
> > https://intel.github.io/ccc-linux-guest-hardening-docs/tdx-guest-hardening.html#td-guest-fuzzing
> >
> > Best Regards,
> > Elena.
>
> Hi Elena,

Hi Jeremi,

>
> I think it might be a good idea to narrow down a configuration that *can*
> reasonably be hardened to be suitable for confidential computing, before
> proceeding with fuzzing. Eg. a lot of time was spent discussing PCI devices
> in the context of virtualization, but what about taking PCI out of scope
> completely by switching to virtio-mmio devices?

I agree that narrowing down is important and we spent a significant effort
in disabling various code we don’t need (including PCI code, like quirks,
early PCI, etc). The decision to use virtio over PCI vs. MMIO I believe comes
from performance and usage scenarios and we have to do the best we can with
these limitations.
Moreover, even if we could remove PCI for the virtio devices by removing the
transport dependency, this isn’t possible for other devices that we know are
used in some CC setups: not all CSPs are using virtio-based drivers, so PCI
comes back into hardening scope pretty quickly and unfortunately we cannot
just remove it.

Best Regards,
Elena.