On 04/01, Al Viro wrote:
>
> On Wed, Apr 01, 2009 at 01:28:01AM +0100, Hugh Dickins wrote:
> >
> > Otherwise it looks good to me, except I keep worrying about those
> > EAGAINs.
>
> Frankly, -EAGAIN in situation when we have userland race is fine. And
> we *do* have a userland race here - execve() will kill -9 those threads
> in case of success, so if they'd been doing something useful, they are
> about to be suddenly screwed.

Can't resist! I dislike the "in_exec && -EAGAIN" oddity too.

Yes sure, we can't break the "well written" applications. But imho this
looks strange. And a bit asymmetrical wrt LSM_UNSAFE_SHARE: I mean,
check_unsafe_exec() allows sub-threads to race, or to share via CLONE_FS,
but only in the LSM_UNSAFE_SHARE case.

Another reason: we can get the "my test-case found something strange"
bug-reports.

So. Please feel free to nack or just ignore this message, but since I
personally dislike the current behaviour I should at least try to
suggest something else.

- add "wait_queue_head_t in_exec_wait" to "struct linux_binprm".

- kill fs->in_exec, add "wait_queue_head_t *in_exec_wait_ptr" instead.

- introduce the new helper,

	void fs_lock_check_exec(struct fs_struct *fs)
	{
		write_lock(&fs->lock);

		while (unlikely(fs->in_exec_wait_ptr)) {
			wait_queue_head_t *head = fs->in_exec_wait_ptr;
			DECLARE_WAITQUEUE(wait, current);

			if (fatal_signal_pending(current))
				/*
				 * clone/exec can't succeed, and this
				 * thread won't return to user-space
				 */
				break;

			__add_wait_queue(head, &wait);
			__set_current_state(TASK_KILLABLE);
			write_unlock(&fs->lock);
			schedule();
			write_lock(&fs->lock);
			__remove_wait_queue(head, &wait);
		}
	}

  Or we can return -EANYTHING when fatal_signal_pending(), this doesn't
  matter.

  Note that this helper can block only if we race with our sub-thread in
  the middle of a !LSM_UNSAFE_SHARE exec. Otherwise it is no slower than
  the write_lock(&fs->lock) + if (fs->in_exec) we currently have.
- change copy_fs() to do

	if (clone_flags & CLONE_FS) {
		fs_lock_check_exec(fs);
		fs->users++;
		write_unlock(&fs->lock);
		return 0;
	}

- change check_unsafe_exec():

	void check_unsafe_exec(struct linux_binprm *bprm)
	{
		struct task_struct *p = current, *t;
		unsigned n_fs;

		bprm->unsafe = tracehook_unsafe_exec(p);

		n_fs = 1;
		fs_lock_check_exec(p->fs);
		if (p->fs->in_exec_wait_ptr)
			/* we are going to die */
			goto out;

		rcu_read_lock();
		for (t = next_thread(p); t != p; t = next_thread(t)) {
			if (t->fs == p->fs)
				n_fs++;
		}
		rcu_read_unlock();

		if (p->fs->users > n_fs) {
			bprm->unsafe |= LSM_UNSAFE_SHARE;
		} else {
			bprm->unsafe |= __LSM_EXEC_WAKE;
			init_waitqueue_head(&bprm->in_exec_wait);
			p->fs->in_exec_wait_ptr = &bprm->in_exec_wait;
		}
	out:
		write_unlock(&p->fs->lock);
	}

- and, finally, change do_execve()

	/* execve succeeded */
	current->fs->in_exec_wait_ptr = NULL;

	...

	out_unmark:
		if (bprm->unsafe & __LSM_EXEC_WAKE) {
			write_lock(&current->fs->lock);
			current->fs->in_exec_wait_ptr = NULL;
			wake_up_locked(&bprm->in_exec_wait);
			write_unlock(&current->fs->lock);
		}

Comments?

Oleg.