On Thu, 2020-10-01 at 15:56 +0200, Frederic Weisbecker wrote:
> External Email
>
> ----------------------------------------------------------------------
> On Wed, Jul 22, 2020 at 02:49:49PM +0000, Alex Belits wrote:
> > +/*
> > + * Description of the last two tasks that ran isolated on a given CPU.
> > + * This is intended only for messages about isolation breaking. We
> > + * don't want any references to actual task while accessing this from
> > + * CPU that caused isolation breaking -- we know nothing about timing
> > + * and don't want to use locking or RCU.
> > + */
> > +struct isol_task_desc {
> > +	atomic_t curr_index;
> > +	atomic_t curr_index_wr;
> > +	bool warned[2];
> > +	pid_t pid[2];
> > +	pid_t tgid[2];
> > +	char comm[2][TASK_COMM_LEN];
> > +};
> > +static DEFINE_PER_CPU(struct isol_task_desc, isol_task_descs);
>
> So that's quite a huge patch that would have needed to be split up.
> Especially this tracing engine.
>
> Speaking of which, I agree with Thomas that it's unnecessary. It's too much
> code and complexity. We can use the existing trace events and perform the
> analysis from userspace to find the source of the disturbance.

The idea behind this is that isolation-breaking events are supposed to be
known to the applications while they run normally, and they should not
require any analysis or human intervention to be handled. A process may
exit isolation because some leftover delayed work, for example a timer or
a workqueue, is still present on a CPU, or because a page fault or some
other exception, normally handled silently, is caused by the task. It is
also possible to direct an interrupt to a CPU that is running an isolated
task -- currently it is perfectly valid to set an interrupt's SMP affinity
to a CPU running an isolated task, and the interrupt will then break
isolation. While that is probably not the best way of handling interrupts,
I would rather not prohibit it explicitly.
There is also the matter of avoiding race conditions on entering
isolation. Once a CPU has entered isolation, other CPUs should avoid
disturbing it when they know that the CPU is running a task in isolated
mode. However, for a short time after entering isolation, other CPUs may
be unaware of this and will still send IPIs to it. Preventing this
scenario completely would be very costly in terms of what other CPUs would
have to do before notifying others, so, similar to how EINTR works, we can
simply specify that this is allowed, and the task is supposed to re-enter
isolation afterwards. It is still a bad idea to specify that isolation
breaking can keep happening while the application is running in isolated
mode; however, allowing some "grace period" after entering is acceptable
as long as the application is aware that it can happen.

In libtmc I have moved this handling of isolation breaking into a separate
thread, intended to become a separate daemon if necessary. In part this
was done because the initial implementation of isolation made it very
difficult to avoid repeating delayed work on isolated CPUs, so something
had to watch for it from a non-isolated CPU. It is possible that now, when
delayed work no longer appears on isolated CPUs out of nowhere, the need
for an isolation manager thread will disappear, and the task itself will
be able to handle all isolation breaking, as Chris's original
implementation intended. In either case, however, it is still useful for
the task, or the isolation manager, to get a description of the
isolation-breaking event. This is what those things are intended for. Now
they only produce log messages, because that is where all descriptions of
isolation-breaking events initially went; however, I would prefer to make
logging optional but always let applications read those event
descriptions, regardless of any tracing mechanism being used.
I was more focused on making the reporting mechanism properly detect the
cause of isolation breaking, because that functionality was not quite
working in the earlier work by Chris and Yuri, so I have kept logging as
the only output, but made it suitable for producing events that
applications will be able to receive. The application, or isolation
manager, will receive clear and unambiguous reporting, so there will be no
need for any additional analysis or guesswork.

After adding proper "low-level" isolation flags, I got the idea that we
might have an even better reporting mechanism. Early isolation-breaking
detection on kernel entry may set a flag that says isolation breaking
happened but its cause is unknown. Or, more likely, only some general
information about the isolation breaking is available, such as the type of
exception. Then, once a known isolation-breaking reporting mechanism is
called from interrupt, syscall, IPI or exception processing, the flag is
cleared and the reporting is done. However, if the kernel then returns to
userspace on an isolated task while the isolation breaking has not been
reported yet, an isolation-breaking report with "unknown cause" will be
produced. We may even add some more optional lightweight tracing for
debugging purposes; however, the guarantee that reporting will happen
allows us to make sure that, no matter how complicated exception
processing is, or how we managed to miss some subtle detail of an
architecture where we are implementing task isolation, there will be a
reliable way to tell the user that something is wrong, and to tell the
task that there is something it has to react to.

>
> Thanks.
>