On Thu, 2020-03-05 at 19:33 +0100, Frederic Weisbecker wrote:
> On Wed, Mar 04, 2020 at 04:07:12PM +0000, Alex Belits wrote:
> >
>
> Hi Alew,
>
> I'm glad this patchset is being resurected.
> Reading that changelog, I like the general idea and the direction.
> The diff is a bit scary though but I'll check the patches in detail
> in the upcoming days.
>

I made some updates -- added missing code for arm and x86, restored
sign-off lines and updated commit messages.

This is the result of work that mostly happened on earlier versions
and had to deal with the fact that timers and housekeeping work often
appeared on all CPUs, so some solutions may look like overkill.
Nevertheless, it was very helpful for finding the sources of
unexpected disturbances.

Also, some of the race conditions and potential delayed work at the
time when a task is entering the isolated state were originally
considered unavoidable. So the kernel part focused on handling those
conditions correctly, while detecting and dealing with their
consequences was done in userspace (in libtmc). Now it looks like
there may be far fewer such situations; however, I am still not
thrilled with the idea of complicating the kernel more than we have
to, especially when it comes to code that is relevant only for a few
seconds while the task is starting and entering isolated mode.

So I have to admit that some solutions look like "more EINTR than
EINTR", and I still like them more than making the kernel side of
entering/exiting isolation even more complex than it is now. I may be
wrong, and there may be a more elegant solution; however, I don't see
it now. The userspace-assisted isolation entering/exiting procedure
worked very well in a system with a huge number of cores, threads,
drivers with unusual features, etc., so at the very least we have a
usable reference point.

> > In a number of cases we can tell on a remote cpu that we are
> > going to be interrupting the cpu, e.g. via an IPI or a TLB flush.
> > In that case we generate the diagnostic (and optional stack dump)
> > on the remote core to be able to deliver better diagnostics.
> > If the interrupt is not something caught by Linux (e.g. a
> > hypervisor interrupt) we can also request a reschedule IPI to
> > be sent to the remote core so it can be sure to generate a
> > signal to notify the process.
>
> I'm wondering if it's wise to run that on a guest at all :-)
> Or we should consider any guest exit to the host as a
> disturbance, we would then need some sort of paravirt
> driver to notify that, etc... That doesn't sound appealing.

Why not? I am not a big fan of virtualization; however, people seem
to use it for all kinds of purposes now, and we only have to
propagate (or reject) isolation requests from guest to host (as long
as resource and permission policies allow that). For KVM it would
literally mean replicating the guest task isolation state on the
host, and as long as the CPU core is isolated, does it really matter
if the task was created with two layers of virtualization instead of
one?

For isolation to make sense, it's still code running on a CPU with a
fixed address mapping. If this is still the case, virtualization only
determines what can be in that space, not how it behaves. If this is
not the case, and the task causes kernel code to run, be it guest or
host kernel, then something is wrong, and isolation is broken. That
is not very different from the behavior without virtualization.

This would have been very bad in the early days of virtualization,
when very little could be done by a guest without the host messing
with it. Now that pieces of hardware can be (relatively) safely given
to the guest userspace to work on, we can just as well let it run
isolated.

>
> Thanks.

Thanks!

--
Alex