Re: Change soft-dirty interface?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Pavel,

On Fri, Jun 14, 2013 at 02:01:23PM +0400, Pavel Emelyanov wrote:
> >>>>> If it's not allowed, another approach should be new system call.
> >>>>>
> >>>>>         int sys_softdirty(pid_t pid, void *addr, size_t len);
> >>>>
> >>>> This looks like existing sys_madvise() one.
> >>>
> >>> Except pid part. It is added by your purpose, which external task
> >>> can control any process.
> 
> In CRIU we can work with pid-less syscalls just fine :) So extending regular
> madvise would work.

I didn't know that.
Just out of curiosity. How can CRIU control other tasks without pid?

> 
> >>>>
> >>>>> If we approach new system call, we don't need to maintain current
> >>>>> proc interface and it would be very handy to get a information
> >>>>> without pagemap (open/read/close) so we can add a parameter to
> >>>>> get a dirty information easily.
> >>>>>
> >>>>>         int sys_softdirty(pid_t pid, void *addr, size_t len, unsigned char *vec)
> >>>>>
> >>>>> What do you think about it?
> >>>>>
> >>>>
> >>>> This is OK for me, though there's another issue with this API I'd like
> >>>> to mention -- consider your app is doing these tricks with soft-dirty
> >>>> and at the same time CRIU tools live-migrate it using the soft-dirty bits
> >>>> to optimize the freeze time.
> >>>>
> >>>> In that case soft-dirty bits would be in wrong state for both -- you app
> >>>> and CRIU, but with the proc API we could compare the ctime-s of the 
> >>>> clear_refs file and find out, that someone spoiled the soft-dirty state
> >>>> from last time we messed with it and handle it somehow (copy all the memory
> >>>> in the worst case). Can we somehow handle this with your proposal?
> >>>
> >>> Good point I didn't think over that.
> >>> A simple idea popped from my mind is we can use read/write lock
> >>> so if pid is equal to calling process's one or pid is NULL,
> >>> we use read side lock, which can allow marking soft-dirty 
> >>> several vmas with parallel. And pid is not equal to calling
> >>> process's one, the API should try to hold write-side lock
> >>> then, if it's fail, the API should return EAGAIN so that CRIU
> >>> can progress other processes and retry it after a while.
> >>> Of course, it would make live-lock so that sys_softdirty might
> >>> need another argument like "int block".
> >>
> >> And we need a flag to show SELF_SOFT_DIRTY or EXTERNAL_SOFT_DIRTY
> >> and the flag will be protected by above lock. It could prevent mixed
> >> case by self and external.
> > 
> > I realized it's not enough. Another idea is here.
> > The intenion is followin as,
> > 
> > self softdirty VS self softdirty -> NOT exclusive
> > self softdirty VS external softdirty -> exclusive
> > external softdirty VS external softdirty-> excluisve
> 
> I think it might work for us. However, I have two comments to the
> implementation, please see below.
> 
> > struct softdirty token {
> >         u64 external;
> >         u64 internal;
> > };
> > 
> >        int sys_set_softdirty(pid_t pid, unsigned long start, size_t len,
> >                                 struct softdirty *token); 

I should have mentioned that start and len are ignored if pid is not eqaul
to caller's pid.

> >        int sys_get_softdirty(pid_t pid, unsigned long start, size_t len, 
> >                                 struct softdirty token, char *vec);
> 
> Can you please show an example how to use these two, I don't quite get how
> can I do external soft-dirty tracking in atomic manner.

Hmm, I don't know how CRIU works but ...

	while(1) {

		struct softdirty token;
		
		sys_set_softdirty(tracked_pid, 0, 0, &token);
		...
		...
		...
		if (!sys_get_softdirty(tacked_pid, 0, 0, token, NULL))
			break;
	}

Maybe do you have a concern about live-lock?

> 
> > 
> > SYSCALL(set_softdirty, ..., token)
> > {
> >         struct task_struct *tsk = task_from_pid(pid);
> >         mutex_lock(&mm->st_lock);
> >         if (tsk == current)
> >                 tsk->mm->token.internal++; 
> >         else
> >                 tsk->mm->token.external++;
> >         token->external = mm->token.external;
> >         token->internal = mm->token.internal;
> >         mutex_unlock(&mm->st_lock);
> >         ..
> >         ..
> > 
> > }
> > 
> > SYSCALL(get_softdirty, ..., token, ...)
> > {
> >         struct task_struct *tsk = task_from_pid(pid);
> >         mutex_lock(&mm->st_lock);
> >         if (tsk == current) {
> >                 if (tsk->mm->token.external != token.external) {
> >                         mutex_unlock
> >                         return -EAGAIN;
> >                 }
> >         } else {
> >                 if (tsk->mm->token.external != token.external ||
> >                     tsk->mm->token.internal != token.internal) {
> >                         mutex_unlock;
> >                         return -EAGAIN;
> >                 }
> >         }
> >         mutex_unlock(&mm->st_lock);
> 
> Presumably the critical section should be longer, as if tokens match and we
> release the lock and proceed with working on pagemap, the concurrent call
> to set_softdirty can proceed and spoil the picture.

True.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]