On Thu, 2021-05-06 at 18:45 +0200, David Hildenbrand wrote:
> On 06.05.21 17:26, James Bottomley wrote:
> > On Wed, 2021-05-05 at 12:08 -0700, Andrew Morton wrote:
> > > On Wed, 3 Mar 2021 18:22:00 +0200 Mike Rapoport <rppt@xxxxxxxxxx> wrote:
> > >
> > > > This is an implementation of "secret" mappings backed by a file
> > > > descriptor.
> > > >
> > > > The file descriptor backing secret memory mappings is created
> > > > using a dedicated memfd_secret system call. The desired
> > > > protection mode for the memory is configured using the flags
> > > > parameter of the system call. The mmap() of the file descriptor
> > > > created with memfd_secret() will create a "secret" memory
> > > > mapping. The pages in that mapping will be marked as not
> > > > present in the direct map and will be present only in the page
> > > > table of the owning mm.
> > > >
> > > > Although normally Linux userspace mappings are protected from
> > > > other users, such secret mappings are useful for environments
> > > > where a hostile tenant is trying to trick the kernel into
> > > > giving them access to other tenants' mappings.
> > >
> > > I continue to struggle with this and I don't recall seeing much
> > > enthusiasm from others. Perhaps we're all missing the value
> > > point and some additional selling is needed.
> > >
> > > Am I correct in understanding that the overall direction here is
> > > to protect keys (and perhaps other things) from kernel
> > > bugs? That if the kernel was bug-free then there would be no
> > > need for this feature? If so, that's a bit sad. But realistic I
> > > guess.
> >
> > Secret memory really serves several purposes. The "increase the
> > level of difficulty of secret exfiltration" you describe. And, as
> > you say, if the kernel were bug free this wouldn't be necessary.
> >
> > But also:
> >
> >    1. Memory safety for user space code. Once the secret memory is
> >       allocated, the user can't accidentally pass it into the
> >       kernel to be transmitted somewhere.
>
> That's an interesting point I didn't realize so far.
>
> >    2. It also serves as a basis for context protection of virtual
> >       machines, but other groups are working on this aspect, and
> >       it is broadly similar to the secret exfiltration from the
> >       kernel problem.
> >
>
> I was wondering if this also helps against CPU microcode issues like
> spectre and friends.

It can for VMs, but not really for the user space secret memory use
cases ... the in-kernel mitigations already present are much more
effective.

> > > Is this intended to protect keys/etc after the attacker has
> > > gained the ability to run arbitrary kernel-mode code? If so,
> > > that seems optimistic, doesn't it?
> >
> > Not exactly: there are many types of kernel attack, but mostly the
> > attacker either manages to effect a privilege escalation to root or
> > gets the ability to run a ROP gadget. The object of this code is
> > to be completely secure against root trying to extract the secret
> > (somewhat similar to the lockdown idea), thus defeating privilege
> > escalation and to provide "sufficient" protection against ROP
> > gadget.
>
> What stops "root" from mapping /dev/mem and reading that memory?

/dev/mem uses the direct map for the copy at least for read/write, so
it gets a fault in the same way root trying to use ptrace does. I
think we've protected mmap, but Mike would know that better than I.

> IOW, would we want to enforce "CONFIG_STRICT_DEVMEM" with
> CONFIG_SECRETMEM?

Unless there's a corner case I haven't thought of, I don't think it
adds much. However, doing a full lockdown on a public system where
users want to use secret memory is best practice I think (except I
think you want it to be the full secure boot lockdown to close all the
root holes).

> Also, there is a way to still read that memory when root by
>
> 1. Having kdump active (which would often be the case, but maybe not
>    to dump user pages)
> 2. Triggering a kernel crash (easy via proc as root)
> 3. Waiting for the reboot after kdump created the dump and then
>    reading the content from disk.

Anything that can leave physical memory intact but boot to a kernel
where the missing direct map entry is restored could theoretically
extract the secret. However, it's not exactly going to be a stealthy
extraction ...

> Or, as an attacker, load a custom kexec() kernel and read memory
> from the new environment. Of course, the latter two are advanced
> mechanisms, but they are possible when root. We might be able to
> mitigate, for example, by zeroing out secretmem pages before booting
> into the kexec kernel, if we care :)

I think we could handle it by marking the region, yes, and a zero on
shutdown might be useful ... it would prevent all warm reboot type
attacks.

James
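
For readers following the thread, the memfd_secret() plus mmap() flow
described in the quoted cover letter can be sketched roughly as below.
This is only an illustration, not code from the patch set. It assumes
syscall number 447 (the number memfd_secret is understood to have on
x86_64 and arm64 in mainline kernels that carry the feature), flags of
0, and a shared mapping. The helper name secret_roundtrip is
hypothetical. It returns None when the kernel does not offer secret
memory.

```python
import ctypes
import mmap
import os
import sys

# Assumption: 447 is the memfd_secret syscall number on x86_64 and arm64;
# other architectures or out-of-tree patch versions may differ.
NR_MEMFD_SECRET = 447


def secret_roundtrip(payload: bytes, size: int = mmap.PAGESIZE):
    """Store payload in a secret mapping and read it back.

    Returns the bytes read back, or None if secret memory is unavailable
    (non-Linux host, old kernel, feature disabled, or payload too large).
    """
    if not sys.platform.startswith("linux") or len(payload) > size:
        return None

    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long

    # flags = 0: no particular protection mode requested (assumption).
    fd = libc.syscall(ctypes.c_long(NR_MEMFD_SECRET), ctypes.c_uint(0))
    if fd < 0:
        return None  # e.g. ENOSYS when the kernel lacks or disables it

    try:
        # Size the descriptor, then map it into this process only.
        os.ftruncate(fd, size)
        with mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as m:
            # Plain user space stores and loads through the mapping.
            m[:len(payload)] = payload
            return bytes(m[:len(payload)])
    except (OSError, ValueError):
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(secret_roundtrip(b"example key material"))
```

On a kernel without the feature the script simply prints None. Where
it is available, the payload is written and read back through the
process's own page table, which is the only place the pages are meant
to be present.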