Re: Runtime verification for instances

Daniel Bristot de Oliveira <bristot@xxxxxxxxxx> · Tue, 11 Oct 2022 19:01:39 +0200

On 10/6/22 20:57, Joel Fernandes wrote:
> On Wed, Oct 5, 2022 at 12:07 PM Daniel Bristot de Oliveira
> <bristot@xxxxxxxxxx> wrote:
>>
>> On 10/3/22 22:35, Joel Fernandes wrote:
>>> Hi Daniel,
>>> I was thinking about your RV for a recent problem.
>>>
>>> Basically, we want to check for state transitions on a per-RCU
>>> callback basis. For example, we want to verify that for certain types
>>> of RCU callback, there is no chance of a scheduler wake up happening
>>> (provided such wake up happens for non-hardIRQ context).
>>
>> ack.
>>
>>>
>>> This means every RCU callback in flight needs its own automaton which
>>> is created when the callback is queued and destroyed when the callback
>>> is executed.
>>
>> do you have a "structure callback?" or could attach some extra info into it?
> 
> Unfortunately no, we have rcu_head but the idea is to do this without
> having to grow it.  In theory, you can do this with a hashtable
> maintained by RV and use the callback function as key, with value
> being automata.

right, a global hash table? per-cpu? per task?

>> There cannot be a single global automaton. And we have to
>>> make transitions for different automaton depending on their state.
>>>
>>> I bet you have already thought of something like this for tasks where
>>> you need a per-task automaton instead of a global one for some use
>>> case which requires tracking each task separately.
>>
>> Yep! we do have per task monitors (already).
> 
> Nice. This depends on growing task_struct though right?

We have it there already. If we need a per-task "aux" data we can justify adding
a pointer to a priv data. Note that I was already foreseeing this.

For example I will likely use it to monitor sched deadline tasks (I need
additional data to track the task from the monitor point of view).

Note that this will be available for *any* monitor so, not something
only for your use-case.

>>> We can implement this in RCU itself, however it will be ugly due to
>>> CONFIG options. And also we cannot do it easily in production. On the
>>> other hand, with RV, we could simply have RV watch the tracepoints and
>>> verify them. It will be clean.
>>
>> So, I am not sure if I got the problem here. So, you are saying that it will
>> be easier with RV. Right?
>>
>> If you need one monitor (automata) per task, we have support for it already.
>> If we need "another type of object," we can add.
> 
> Yeah I was hoping that with RV, you could already allocate a new
> automata dynamically or something for a new task/callback/whatever,
> and then drive that automata using tracepoints. So RV can be a nice
> place to do this kind of stuff, without adding debugging code to other
> subsystems.

We can make a dynamic as well, as long as you can allocate memory in the point where
you identify a new "instance," it would work.

Monitors are pretty self-contained, they can do their magic.

We can figure out a possible way.

>>> This is just 1 usecase (wakeup) , we also want to extend something
>>> like this to more things, like making sure callbacks execute in a time
>>> bounded fashion.
>>
>>
>> I know verifying temporal properties doesn't directly
>>> fall under RV,
>>
>> temporal or order or temporal of time? For temporal order yep we have it (I
>> se an overload of this term here... haha). For time, we can add additional
>> structure to transform them into logical events.
> 
> Yes, sorry I mean time. So time from when callback queued to when it
> is invoked falls within a bounded time period.
> 
> But more urgently what we need is:
> 
> For a certain type of callback that is invoked, check if the callback
> does a scheduler wake up. That is - doing a wake up when this type of
> callback is called is a BAD state.
> 
> This can be done purely using tracepoints:
> - We have to add a new tracepoint to call_rcu() for the new callback
> type (the type being Lazy).
> - We have to add a new tracepoint for when callback invocation ends
> (for invocation start, we already have trace_rcu_invoke_callback).
> 
> Then hook up RV to all of these, and verify that no scheduler_wakeup
> can happen during the invocation of a Lazy callback. But it has to be
> done on a per-callback basis.

So, let me see if I understood:

You have a Lazy callback, and you know it is lazy only at the call_rcu time.

Then when you start running the callback:
	no wakeup if it is of the type lazy
until you finish the execution?

> On the other hand, we can simplify it to a single automata by setting
> a per-cpu state for whenever a Lazy callback is invoked. But the
> problem is we have no idea that a callback is lazy when we are
> invoking it.

Wouldn't a global hash help for this?

per-cpu monitor + a global structure that is feed at call_rcu time?

> Thanks.
>