On Tue, 12 Mar 2024 at 15:48, Petr Tesařík <petr@xxxxxxxxxxx> wrote:
>
> > - How we’ve solved the TLB flushing issues in sensitivity tracking, and
> >   how it could be done better.
>
> Hello and welcome! I ran into a similar challenge with SandBox Mode. My
> solution was to run sandbox code with CPL=3 (on x86) and control page
> access with the U/S PTE bit rather than the P bit, which allowed me to
> implement lazy TLB invalidation. The x86 folks didn't like idea...

Hmm, a similar idea might be to use protection keys. I'm not sure if
that really works though, we haven't given it any serious thought,
since not all CPUs support it. So that would be something to explore
as a later optimisation rather than a basic principle.

> For the record, SandBox Mode was designed with confidentiality in mind,
> although the initial patch series left out this part for simplicity. I
> wonder if your objective is to protect kernel data from user space, or
> if you have also considered decomposing the kernel into components that
> are isolated from each other (and then it we could potentially find
> some synergies).

Yeah that's something we've pondered. What I've presented here is
definitely about protecting the kernel from userspace/VM guest but
it's a framework where you could conceivably isolate all sorts of
things. Maybe there's a world where ASI makes unprivileged BPF a more
viable notion.

The thing is, what I'm presenting here doesn't protect against
software bugs at all - if you can get the kernel to architecturally
access data and do something bad with it, ASI will happily remap that
data and branch back to the buggy code. That probably simplifies
things quite a lot as compared to SBM.

But yes, the whole "sensitivity tracking" thing does seem to share
requirements with SandBox Mode, I will need to ponder this some more.