On Thu, Mar 25, 2010 at 2:35 AM, Oren Laadan <orenl@xxxxxxxxxxxxxxx> wrote: > > > Matt Helsley wrote: >> >> On Wed, Mar 24, 2010 at 08:36:39PM +0100, Christoffer Dall wrote: >>> >>> On Wed, Mar 24, 2010 at 4:53 PM, Oren Laadan <orenl@xxxxxxxxxxxxxxx> >>> wrote: >>>> >>>> Matt Helsley wrote: >>>>> >>>>> On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote: >>>>>> >>>>>> On Tue, 23 Mar 2010, Matt Helsley wrote: >>>>>> >>>>>>> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux >>>>>>> wrote: >>>>>>>> >>>>>>>> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote: >>>>>>>>> >>>>>>>>> This small commit introduces a global state of system calls for ARM >>>>>>>>> making it possible for a debugger or checkpointing to gain >>>>>>>>> information >>>>>>>>> about another process' state with respect to system calls. >>>>>>>> >>>>>>>> I don't particularly like the idea that we always store the syscall >>>>>>>> number to memory for every system call, whether the stored version >>>>>>>> is >>>>>>>> used or not. >>>>>>>> >>>>>>>> Since ARM caches are generally not write allocate, this means mostly >>>>>>>> write-only variables can have a higher than expected expense. >>>>>>>> >>>>>>>> Is there not some thread flag which can be checked to see if we need >>>>>>>> to >>>>>>>> store the syscall number? >>>>>>> >>>>>>> Perhaps before we freeze the task we can save the syscall number on >>>>>>> ARM. >>>>>>> The patches suggest that the signal delivery path -- which the >>>>>>> freezer >>>>>>> utilizes -- has the syscall number already. >>>> >>>> Actually, the signal path doesn't have the syscall number, it has >>>> a binary "in syscall" value. >>>> >> >> Argh. I read too much into the name :(. >> >>> Well, this could be changed to pass the syscall number through >>> registers along to try_to_freeze without any mentionable performance >>> hit. >> >> Yes, that's possible. I was thinking we could still use your thread info >> field but only store to it when we know it will be useful for c/r rather >> than for each syscall. Personally, I'd rather avoid passing the extra >> parameter into try_to_freeze(). Your idea below seems better to me. >> >>> Re-using the assembly code or factoring it out so that it can be used >>> from multiple places doesn't seem very pleasing to me, as the assembly >>> code is in the critical path and written specifically for the context >>> of a process entering the kernel. Please correct me if I'm wrong. >>> >>> I imagine simply a function in C, more or less re-implementing the >>> logic that's already in entry-common.S, might do the trick. I wouldn't >>> worry much about the performance in this case as it will not be used >>> often. The following _untested_ snippet illustrates my idea: >>> >>> --- >>> arch/arm/include/asm/syscall.h | 93 >>> +++++++++++++++++++++++++++++++++++++++- >>> 1 files changed, 92 insertions(+), 1 deletions(-) >>> >>> diff --git a/arch/arm/include/asm/syscall.h >>> b/arch/arm/include/asm/syscall.h >>> index 3b3248f..a7f2615 100644 >>> --- a/arch/arm/include/asm/syscall.h >>> +++ b/arch/arm/include/asm/syscall.h >>> @@ -10,10 +10,101 @@ >>> #ifndef _ASM_ARM_SYSCALLS_H >>> #define _ASM_ARM_SYSCALLS_H >>> >>> +static inline int get_swi_instruction(struct task_struct *task, >>> + struct pt_regs *regs, >>> + unsigned long *instr) >>> +{ >>> + struct page *page = NULL; >>> + unsigned long instr_addr; >>> + unsigned long *ptr; >>> + int ret; >>> + >>> + instr_addr = regs->ARM_pc - 4; >>> + >>> + down_read(&task->mm->mmap_sem); >>> + ret = get_user_pages(task, task->mm, instr_addr, >>> + 1, 0, 0, &page, NULL); >>> + up_read(&task->mm->mmap_sem); >>> + >>> + if (ret < 0) >>> + return ret; >>> + >>> + ptr = (unsigned long *)kmap_atomic(page, KM_USER1); >>> + memcpy(instr, >>> + ptr + (instr_addr >> PAGE_SHIFT), >> >> ^shouldn't this be: >> instr_addr & PAGE_MASK >> >>> + sizeof(unsigned long)); >>> + kunmap_atomic(ptr, KM_USER1); >>> + >>> + page_cache_release(page); >>> + >>> + return 0; >>> +} >> >> (again, not familiar with ARM so my understanding is: >> >> I guess swi is "syscall word immediate". >> >> The syscall nr is embedded in the instruction as an immediate >> value and you're getting a copy of that instruction using the value of >> the pc register just after the syscall instruction was executed.) >> >> Perhaps I am missing or forgetting something. Why isn't this as simple >> as calling get_user() or even copy_from_user() using instr_addr? > > In c/r, we only need it at restart when a task calls it on itself. > > However the interface itself of get_syscall_nr() can be called by > any task on another task. > > (In fact, I think that for the most part, saving the syscall number > at checkpoint time may be better than figuring out at restart time). > So, as Oren is saying, the point was to make the syscall_get_nr(..) work according to the interface specified in include/asm-generic/syscall.h. Considering it's unknown how we will deal with checkpoint/restart across CONFIG_ARM_THUMB, CONFIG_OABI_COMPAT etc., I also think it's a better idea to checkpoint the syscall number at checkpoint and for the restore, place architecture specific hooks to get the syscall number instead of calling syscall_get_nr(...) directly. In this way we should always be able to get the syscall and correctly restart, independently of what tricks we do to checkpoint restart across configuration settings - if any. Best, Christoffer _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/containers