Jonathan Cameron wrote: > > > >>> - tdi_info - read measurements/certs/interface report; > > >> > > >> Does this return cached cert chains and measurements from the device > > >> or does it retrieve them anew? (Measurements might have changed if > > >> MEAS_FRESH_CAP is supported.) > > >> > > >> > > >>> If the user wants only CMA/SPDM, the Lukas'es patched will do that without > > >>> the PSP. This may co-exist with the AMD PSP (if the endpoint allows multiple > > >>> sessions). > > >> > > >> It can co-exist if the pci_cma_claim_ownership() library call > > >> provided by patch 12/12 is invoked upon device_connect. > > >> > > >> It would seem advantageous if you could delay device_connect > > >> until a device is actually passed through. Then the OS can > > >> initially authenticate and measure devices and the PSP takes > > >> over when needed. > > > > > > Would that delay mean IDE isn't up - I think that wants to be > > > available whether or not pass through is going on. > > > > > > Given potential restrictions on IDE resources, I'd expect to see an explicit > > > opt in from userspace on the host to start that process for a given > > > device. (udev rule or similar might kick it off for simple setups). > > > > > > Would that work for the flows described? > > > > This would work but my (likely wrong) intention was also to run > > necessary setup in both host and guest at the same time before drivers > > probe devices. And while delaying it in the host is fine (well, for us > > in AMD, as we are aiming for CoCo/TDISP), in the guest this means less > > flexibility in enlightening the PCI subsystem and the guest driver: > > ideally (or at least initially) the driver is supposed to probe already > > enabled and verified device, as otherwise it has to do SWIOTLB until the > > userspace does the verification and kicks the driver to go proper direct > > DMA (or reload the driver?). > > In the case of a guest getting a VF, there probably won't be any way for > the kernel to run any native attestation anyway, so policy would have to > rely on the CoCo paths. Kernel stuff Lukas has would just not try to attest > or claim anything about it. If a VF has a CMA capable DOE instance > then that's not there for IDE stuff at all, but for the guest to get > direct measurements etc without PSP or anything else getting involved > in which case the guest using that directly is a reasonable thing to do. Is that a practical reality that VFs are going to implement CMA? My expectation is CMA is a PF facility and the TSM retrieves measurements for TDIs through that. At least that seems to be fundamental assumption of the TDISP specification. Given config-cycles are always host-mediated I expect guest CMA will always be a proxy whether it is is per-VF CMA interface or not. > > > > > > Next bit probably has holes... Key is that a lot of the checks > > > may fail, and it's up to host userspace policy to decide whether > > > to proceed (other policy in the secure VM side of things obviously) > > > > > > So my rough thinking is - for the two options (IDE / TDISP) > > > > > > Comparing with Alexey's flow I think only real difference is that > > > I call out explicit host userspace policy controls. I'd also like > > > > My imagination fails me :) What is the host supposed to do if the device > > verification fails/succeeds later, and how much later, and the device is > > a boot disk? Or is this userspace going to be limited to initramdisk? > > What is that thing which we are protecting against? Or it is for CUDA > > and such (which yeah, it can wait)? > > There are a bunch of non obvious cases indeed. Hence make it all policy. > Though if you have a flow where verification is needed for boot disk and > it fails (and policy says that's not acceptable) then bad luck you > probably need to squirt a cert into your ramdisk or UEFI or similar. It seems policy mechanisms should be incrementally added as clear need for policy dictates, because that has ABI implications and kernel-depedency-on-userpace expectations. > > > to use similar interfaces to convey state to host userspace as > > > per Lukas' existing approaches. Sure there will also be in > > > kernel interfaces for driver to get data if it knows what to do > > > with it. I'd also like to enable the non tdisp flow to handle > > > IDE setup 'natively' if that's possible on particular hardware. > > > > > > 1. Host has a go at CMA/SPDM. Policy might say that a failure here is > > > a failure in general so reject device - or it might decide it's up to > > > the PSP etc. (userspace can see if it succeeded) > > > I'd argue host software can launch this at any time. It will > > > be a denial of service attack but so are many other things the host > > > can do. > > > > Trying to visualize it in my head - policy is a kernel cmdline or module > > parameter? > > Neither - it's bind not happening until userspace decides to kick it off. > The module could provide it's own policy on top of this - so userspace > could defer to that if it makes sense (so bind but rely on probe failing > if policy not met). udev module policy can already gate binding, its not clear new policy mechanism is needed here. > > > > > > 2. TDISP policy decision from host (userspace policy control) > > > Need to know end goal. > > > > /sys/bus/pci/devices/0000:11:22.3/tdisp ? > > Maybe - I'm sure we'll bikeshed anything like that :) > > > > > > 3. IDE opt in from userspace. Policy decision. > > > - If not TDISP > > > - device_connect(IDE ONLY) - bunch of proxying in host OS. > > > - Cert chain and measurements presented to host, host can then check if > > > it is happy and expose for next policy decision. > > > - Hooks exposed for host to request more measurements, key refresh etc. > > > Idea being that the flow is host driven with PSP providing required > > > services. If host can just do setup directly that's fine too. > > > > I'd expect the user to want IDE on from the very beginning, why wait to > > turn it on later? The question is rather if the user wants to panic() or > > warn() or block the device if IDE setup failed. > > There are some concerns about being able to support enough selective IDE streams. > Might turn out to be a false concern (I've not yet got visibility of enough > implementations to be able to tell). > Also (as I understand it as a software guy) IDE has a significant performance > and power cost (and for CXL at least there are various trade offs and options > you can enable depending on security model and device features). > > There is "talk" of people turning IDE off if they can cope without it and only > enabling for CoCo (and possibly selectively doing that as well) Agree, IDE stream resource allocation is something an admin needs to be able to reason about. > > > - If TDISP (technically you can run tdisp from host, but lets assume > > > for now no one wants to do that? (yet)). > > > - device_connect(TDISP) - bunch of proxying in host OS. > > > - Cert chain and measurements presented to host, host can then check if > > > it is happy and expose for next policy decision. > > > > On AMD SEV TIO the TDISP setup happens in "tdi_bind" when the device is > > about to be passed through which is when QEMU (==userspace) starts. > Ah. Ok. > > > > > > > > > 4. Flow after this depends on early or late binding (lockdown) > > > but could load driver at this point. Userspace policy. > > > tdi-bind etc. > > > > Not sure I follow this. A host or guest driver? > > Hmm - I confess I'm confusing myself now. > > At this stage we just have enough info to load a driver for the PF because > to get to state we want locked prior to VF assignment the PF driver may > have some configuration to do. > > If all that goes well and the TDI can be moved to locked state, and assigned > to a TVM which then has to decide to issue tdi_validate before binding > the guest driver (which I assume is the TDISP START_INTERFACE_REQUEST > bit of the state machine). Locked before assignment is valid, but lock after assignment (upon guest wanting to transition a TDI or recover a TDI after an error) is likely one of the first incremental features to add after a baseline is established. > Or is the guest driver ever needed before this > transition? (I see you called it out as not, but is it always a one time > thing on driver load or can that decision change without unbind/bind > of driver?) For whole device passthrough it may be the case that the guest needs to do some operations with the device in shared mode before taking it private. > I know this gets more complex for the PF pass through cases where the > driver needs to load and do some setup before you can lock down the device > but do people have that requirement for VFs? If they do it feels like > device was designed wrong to me... Agree, because in the full device passthrough case I expect the PF driver to just be generic vfio-pci. > Too many specs (some of which provide too many ways you 'could' do it) > so I may well have a bunch of this wrong :( This is why I think we pick one painfully simple use case to enable first and then incrementally build on it with concrete rationales.