Re: [RFC] tentative prctl task isolation interface

On Thu, Jan 14, 2021 at 09:22:54AM +0000, Christoph Lameter wrote:
> On Wed, 13 Jan 2021, Marcelo Tosatti wrote:
> 
> > So as discussed, this is one possible prctl interface for
> > task isolation.
> >
> > Is this something that is desired? If not, what is the
> > proper way for the interface to be?
> 
> Sure, that sounds like a good beginning, but I guess we need some
> specificity on the features.
> 
> > +Task isolation CPU interface
> > +============================
> 
> How does one do a oneshot flush of OS activities?

        ret = prctl(PR_TASK_ISOLATION_REQUEST, ISOL_F_QUIESCE, 0, 0, 0);
        if (ret == -1) {
                perror("prctl PR_TASK_ISOLATION_REQUEST");
                exit(1);        /* non-zero status: the request failed */
        }

> 
> I.e. I have a polling loop over numerous shared and I/O devices in user
> space and I want to make sure that the system is quiet before I enter the
> loop.

You could configure things in two ways: with syscalls allowed or not. 

Syscalls disallowed:
====================

1) Add a new isolation feature ISOL_F_BLOCK_SYSCALLS (to block certain
syscalls) along with ISOL_F_SETUP_NOTIF (to notify upon isolation
breaking):

        if ((ifeat & ISOL_F_BLOCK_SYSCALLS) == ISOL_F_BLOCK_SYSCALLS) {
                /* pseudocode: list of syscalls to block, additional
                 * parameters */
                struct task_isolation_block_syscalls tibs = { ... };

                /* pseudocode: parameters to control signal handling upon
                 * an isolation breaking event */
                struct task_isolation_notif tis = { ... };

                ret = prctl(PR_TASK_ISOLATION_SET, ISOL_F_SETUP_NOTIF, &tis);
                if (ret != 0) { ... }
                featuremask |= ISOL_F_SETUP_NOTIF;

                ret = prctl(PR_TASK_ISOLATION_SET, ISOL_F_BLOCK_SYSCALLS, &tibs);
                if (ret != 0) { ... }
                featuremask |= ISOL_F_BLOCK_SYSCALLS;

                featuremask |= ISOL_F_QUIESCE;
        }

This would require knowledge of the behaviour of individual system
calls, that is, whether or not a given syscall causes the CPU to become
a target of interruptions (1). The QUIESCE / HARD / WARN division you
propose, by contrast, allows for coarse-grained control.

Perhaps offering coarse control while also allowing finer-grained
control (if desired) is a useful choice?

1: for example adding free pages to per-cpu free lists.

Syscalls allowed:
=================

> In the loop itself some activities may require syscalls so they will
> potentially cause the OS services such as timers to start again.

Or there could be a different mode where the syscall return itself
finishes any pending activities.

> When such
> an activity is complete another quiet down call can be issued.

Although an explicit quiet down call seems more efficient (if multiple
syscalls are to be used).

> Could be implemented by setting a flag that does an action and then resets
> itself?  Or the flag could be reset if a syscall that requires timers etc
> is used?

You mean letting userspace know whether a certain syscall triggered a
pending action which must be finished before "quiet mode" is entered
again? Sounds like a good idea.
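
To make the self-resetting flag concrete, here is a minimal userspace
model of it. This is only a sketch: nothing below is a kernel interface,
and every name (isol_model, model_quiesce, ...) is made up for
illustration.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a self-resetting quiesce flag.
 * None of these names exist in the kernel. */
struct isol_model {
	bool quiesced;         /* set by a quiesce request, cleared by OS activity */
	unsigned int requests; /* number of quiesce requests issued so far */
};

/* Models a oneshot quiesce request: pending work is flushed, flag is set. */
static inline void model_quiesce(struct isol_model *m)
{
	m->quiesced = true;
	m->requests++;
}

/* Models a syscall. If it restarts OS services (timers etc.),
 * the flag resets itself. */
static inline void model_syscall(struct isol_model *m, bool restarts_os_services)
{
	if (restarts_os_services)
		m->quiesced = false;
}

/* Called before re-entering the polling loop: quiesce again only if
 * an earlier syscall actually reset the flag. */
static inline void model_requiesce_if_needed(struct isol_model *m)
{
	if (!m->quiesced)
		model_quiesce(m);
}
```

With this shape the application pays for a second quiesce only when a
syscall really did restart OS activity, rather than after every syscall.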

> Features that I think may be needed:
> 
> F_ISOL_QUIESCE		-> quiet down now but allow all OS activities. OS
> 			activities reset flag
> 
> F_ISOL_BAREMETAL_HARD	-> No OS interruptions. Fault on syscalls that
> 			require such actions in the future.

Question: why BAREMETAL?

Two comments:

1) HARD mode could also block activities from different CPUs that can 
interrupt this isolated CPU (for example CPU hotplug, or increasing 
per-CPU trace buffer size).

It is unclear whether such blocking should be:

-> Done on an individual action basis (eg: BLOCK_CPU_HOTPLUG,
BLOCK_PERCPU_TRACEBUFFER_SIZE, ...) (which could allow
individual unblocking through a sysfs interface, for example).

Or

-> Tied to a flag with a less implementation specific meaning such as
F_ISOL_BAREMETAL_HARD.
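
The two options need not exclude each other. As a rough sketch, and
assuming a plain bitmask representation (the bit positions, the
MODEL_ISOL_HARD_ALL name and both helpers are hypothetical; only the two
BLOCK_* names come from the list above), a coarse flag could simply
imply every individual block:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments for the individual blockable actions. */
#define BLOCK_CPU_HOTPLUG             (UINT64_C(1) << 0)
#define BLOCK_PERCPU_TRACEBUFFER_SIZE (UINT64_C(1) << 1)

/* Assumed semantics: the coarse mode implies all individual blocks. */
#define MODEL_ISOL_HARD_ALL \
	(BLOCK_CPU_HOTPLUG | BLOCK_PERCPU_TRACEBUFFER_SIZE)

/* True if the action must be refused for a CPU with this block mask. */
static inline bool action_blocked(uint64_t block_mask, uint64_t action)
{
	return (block_mask & action) != 0;
}

/* Models individual unblocking (e.g. via a sysfs write): clear one bit. */
static inline uint64_t unblock_action(uint64_t block_mask, uint64_t action)
{
	return block_mask & ~action;
}
```

An administrator could then start from the coarse mask and clear single
bits, which keeps the flag meaning implementation independent while
still allowing per-action overrides.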

2) Some types of applications can tolerate certain interruptions,
as long as they do not cross certain thresholds.
For example, forcing a "no interruptions" mode means losing the
flexibility to read/write MSRs on the isolated CPUs (including
performance counters, RDT/MBM type MSRs, frequency/power
statistics).

That flexibility seems to be useful (so perhaps 
F_ISOL_BAREMETAL_HARD but optionally permitting 
certain interruptions).
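
One way such thresholds might be expressed, purely as an illustration:
a per-window budget of interruption count and total interruption time.
The structures, field names and the accounting function below are all
hypothetical and are not part of the proposed interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tolerance configured by the application. */
struct isol_tolerance {
	uint32_t max_events;   /* interruptions tolerated per accounting window */
	uint64_t max_total_ns; /* total interruption time tolerated per window */
};

/* Hypothetical running totals for the current accounting window. */
struct isol_window {
	uint32_t events;
	uint64_t total_ns;
};

/* Account one interruption of the given duration. Returns true once
 * either threshold is exceeded, i.e. isolation counts as broken. */
static inline bool account_interruption(const struct isol_tolerance *tol,
					 struct isol_window *win,
					 uint64_t duration_ns)
{
	win->events++;
	win->total_ns += duration_ns;
	return win->events > tol->max_events ||
	       win->total_ns > tol->max_total_ns;
}
```

Under this model an occasional short MSR read stays within budget, while
a burst of interruptions or one long one is reported as a violation.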

> F_ISOL_BAREMETAL_WARN	-> Similar. Create a warning in the syslog when OS
> 				services require delayed processing etc
> 				but continue while resetting the flag.
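
To check that I read the three modes correctly, here is how I understand
the reaction to a syscall that requires OS services, written as a small
model. The enum and function names are made up for this sketch; only the
behaviour per mode is taken from your descriptions above.

```c
/* Hypothetical model of the three proposed modes. */
enum model_mode {
	MODEL_QUIESCE,        /* F_ISOL_QUIESCE */
	MODEL_BAREMETAL_HARD, /* F_ISOL_BAREMETAL_HARD */
	MODEL_BAREMETAL_WARN  /* F_ISOL_BAREMETAL_WARN */
};

enum model_outcome {
	OUT_ALLOW_RESET_FLAG, /* allow the activity, flag resets */
	OUT_FAULT,            /* refuse: fault on the syscall */
	OUT_WARN_RESET_FLAG   /* log a warning, continue, flag resets */
};

/* What happens when a syscall needs OS services (timers etc.). */
static inline enum model_outcome on_os_activity(enum model_mode mode)
{
	switch (mode) {
	case MODEL_BAREMETAL_HARD:
		return OUT_FAULT;
	case MODEL_BAREMETAL_WARN:
		return OUT_WARN_RESET_FLAG;
	case MODEL_QUIESCE:
	default:
		return OUT_ALLOW_RESET_FLAG;
	}
}
```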

Alex seems to be interested in different notification methods as well.

Thanks for the input.




