Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces

Christoffer Dall <christofferdall@xxxxxxxxxxxxxxxxxx> · Thu, 25 Mar 2010 11:34:16 +0100

On Thu, Mar 25, 2010 at 2:35 AM, Oren Laadan <orenl@xxxxxxxxxxxxxxx> wrote:
>
>
> Matt Helsley wrote:
>>
>> On Wed, Mar 24, 2010 at 08:36:39PM +0100, Christoffer Dall wrote:
>>>
>>> On Wed, Mar 24, 2010 at 4:53 PM, Oren Laadan <orenl@xxxxxxxxxxxxxxx>
>>> wrote:
>>>>
>>>> Matt Helsley wrote:
>>>>>
>>>>> On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
>>>>>>
>>>>>> On Tue, 23 Mar 2010, Matt Helsley wrote:
>>>>>>
>>>>>>> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
>>>>>>>>>
>>>>>>>>> This small commit introduces a global state of system calls for ARM
>>>>>>>>> making it possible for a debugger or checkpointing to gain
>>>>>>>>> information
>>>>>>>>> about another process' state with respect to system calls.
>>>>>>>>
>>>>>>>> I don't particularly like the idea that we always store the syscall
>>>>>>>> number to memory for every system call, whether the stored version
>>>>>>>> is
>>>>>>>> used or not.
>>>>>>>>
>>>>>>>> Since ARM caches are generally not write allocate, this means mostly
>>>>>>>> write-only variables can have a higher than expected expense.
>>>>>>>>
>>>>>>>> Is there not some thread flag which can be checked to see if we need
>>>>>>>> to
>>>>>>>> store the syscall number?
>>>>>>>
>>>>>>> Perhaps before we freeze the task we can save the syscall number on
>>>>>>> ARM.
>>>>>>> The patches suggest that the signal delivery path -- which the
>>>>>>> freezer
>>>>>>> utilizes -- has the syscall number already.
>>>>
>>>> Actually, the signal path doesn't have the syscall number, it has
>>>> a binary "in syscall" value.
>>>>
>>
>> Argh. I read too much into the name :(.
>>
>>> Well, this could be changed to pass the syscall number through
>>> registers along to try_to_freeze without any mentionable performance
>>> hit.
>>
>> Yes, that's possible. I was thinking we could still use your thread info
>> field but only store to it when we know it will be useful for c/r rather
>> than for each syscall. Personally, I'd rather avoid passing the extra
>> parameter into try_to_freeze(). Your idea below seems better to me.
>>
>>> Re-using the assembly code or factoring it out so that it can be used
>>> from multiple places doesn't seem very pleasing to me, as the assembly
>>> code is in the critical path and written specifically for the context
>>> of a process entering the kernel. Please correct me if I'm wrong.
>>>
>>> I imagine simply a function in C, more or less re-implementing the
>>> logic that's already in entry-common.S, might do the trick. I wouldn't
>>> worry much about the performance in this case as it will not be used
>>> often. The following _untested_ snippet illustrates my idea:
>>>
>>> ---
>>>  arch/arm/include/asm/syscall.h |   93
>>> +++++++++++++++++++++++++++++++++++++++-
>>>  1 files changed, 92 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/syscall.h
>>> b/arch/arm/include/asm/syscall.h
>>> index 3b3248f..a7f2615 100644
>>> --- a/arch/arm/include/asm/syscall.h
>>> +++ b/arch/arm/include/asm/syscall.h
>>> @@ -10,10 +10,101 @@
>>>  #ifndef _ASM_ARM_SYSCALLS_H
>>>  #define _ASM_ARM_SYSCALLS_H
>>>
>>> +static inline int get_swi_instruction(struct task_struct *task,
>>> +                                     struct pt_regs *regs,
>>> +                                     unsigned long *instr)
>>> +{
>>> +       struct page *page = NULL;
>>> +       unsigned long instr_addr;
>>> +       unsigned long *ptr;
>>> +       int ret;
>>> +
>>> +       instr_addr = regs->ARM_pc - 4;
>>> +
>>> +       down_read(&task->mm->mmap_sem);
>>> +       ret = get_user_pages(task, task->mm, instr_addr,
>>> +                            1, 0, 0, &page, NULL);
>>> +       up_read(&task->mm->mmap_sem);
>>> +
>>> +       if (ret < 0)
>>> +               return ret;
>>> +
>>> +       ptr = (unsigned long *)kmap_atomic(page, KM_USER1);
>>> +       memcpy(instr,
>>> +              ptr + (instr_addr >> PAGE_SHIFT),
>>
>>                        ^shouldn't this be:
>>                      instr_addr & PAGE_MASK
>>
>>> +              sizeof(unsigned long));
>>> +       kunmap_atomic(ptr, KM_USER1);
>>> +
>>> +       page_cache_release(page);
>>> +
>>> +       return 0;
>>> +}
>>
>> (again, not familiar with ARM so my understanding is:
>>
>> I guess swi is "syscall word immediate".
>>
>> The syscall nr is embedded in the instruction as an immediate
>> value and you're getting a copy of that instruction using the value of
>> the pc register just after the syscall instruction was executed.)
>>
>> Perhaps I am missing or forgetting something. Why isn't this as simple
>> as calling get_user() or even copy_from_user() using instr_addr?
>
> In c/r, we only need it at restart when a task calls it on itself.
>
> However the interface itself of get_syscall_nr() can be called by
> any task on another task.
>
> (In fact, I think that for the most part, saving the syscall number
> at checkpoint time may be better than figuring out at restart time).
>

So, as Oren is saying, the point was to make the syscall_get_nr(..)
work according to the interface specified in
include/asm-generic/syscall.h.

Considering it's unknown how we will deal with checkpoint/restart
across CONFIG_ARM_THUMB, CONFIG_OABI_COMPAT etc., I also think it's a
better idea to checkpoint the syscall number at checkpoint and for the
restore, place architecture specific hooks to get the syscall number
instead of calling syscall_get_nr(...) directly. In this way we should
always be able to get the syscall and correctly restart, independently
of what tricks we do to checkpoint restart across configuration
settings - if any.

Best,
Christoffer
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers