Re: Q: check_unsafe_exec() races (Was: [PATCH 2/4] fix setuid sometimes doesn't)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 6 Apr 2009, Oleg Nesterov wrote:
> On 04/01, Al Viro wrote:
> >
> > Rebased and pushed (same tree, same branch; included into for-next, along
> > with related cleanups).
> 
> Sorry for delay!

Please don't suppose that you can ever beat me at the slowness game!

> 
> Afaics, the usage of fs->in_exec is not completely right. But firstly, a
> couple of minor nits.
> 
> 
> check_unsafe_exec() doesn't need ->siglock, we can iterate over sub-threads
> under rcu_read_lock(). Note that with RCU or ->siglock we can set the "wrong"
> LSM_UNSAFE_SHARE if we race with copy_process(CLONE_THREAD | CLONE_FS), but
> as it was already discussed we don't care. This means it is OK to miss the
> freshly cloned thread which has already passed copy_fs().

Yes, I agree.
And preferable not to have IRQs disabled over that next_thread() loop.

> 
> 
> do_execve:
> 
> 	/* execve succeeded */
> 	write_lock(&current->fs->lock);
> 	current->fs->in_exec = 0;
> 	write_unlock(&current->fs->lock);
> 
> afaics, fs->lock is not needed. If ->in_exec was set, it was set by this
> thread-group and we do not share ->fs with another process. Since we are
> the only thread now, we can clear ->in_exec lockless.

Right, given your fix below.  I wondered for a moment if a barrier
would then be needed, but no, this is all racy (erring on the safe
side) if the userspace insists on being racy here.

> 
> 
> And now, what I think is wrong:
> 
> do_execve:
> 
> 	out_unmark:
> 		write_lock(&current->fs->lock);
> 		current->fs->in_exec = 0;
> 		write_unlock(&current->fs->lock);
> 
> Two threads T1 and T2 and another process P, all share the same ->fs.
> 
> T1 starts do_execve(BAD_FILE). It calls check_unsafe_exec(), since ->fs is
> shared, we set LSM_UNSAFE but not ->in_exec (actually, not very good name).
> 
> P exits and decrements fs->users.
> 
> T2 starts do_execve(), calls check_unsafe_exec(), now ->fs is not shared,
> we set fs->in_exec.
> 
> T1 continues, open_exec(BAD_FILE) fails, we clear ->in_exec and return
> to the user-space.
> 
> T1 does clone(CLONE_FS /* without CLONE_THREAD */).
> 
> T1 continues without LSM_UNSAFE_SHARE while ->fs is shared with another
> process.

If I follow you correctly, you meant to say T2 not T1 in the last step.

> 
> 
> What do you think about the (uncompiled) patch below ? It doesn't change
> compat_do_execve(), just for discussion.
> 
> But see also another message I am going to send...
> 
> Oleg.
> 
> do_execve() must not clear fs->in_exec if it was set by another thread,
> and we don't need fs->lock to clear.
> 
> Also, s/lock_task_sighand/rcu_read_lock/ in check_unsafe_exec().

Yes, I think your clear_in_exec change is a necessary one,
and your rcu_read_lock well worth while.

One tiny change (aside from extending to compat_do_execve):
Al originally had check_unsafe_exec()'s write_lock(&p->fs->lock)
after the lock_task_sighand(p, &flags), but was forced to invert
that by the IRQ issue lockdep flagged.  I think we'd all prefer
to think of fs->lock as an innermost lock, and would like it
now to go after your rcu_read_lock().

(You do rcu_read_unlock() earlier, but that's okay.)

Hugh

> 
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1060,7 +1060,6 @@ EXPORT_SYMBOL(install_exec_creds);
>  int check_unsafe_exec(struct linux_binprm *bprm)
>  {
>  	struct task_struct *p = current, *t;
> -	unsigned long flags;
>  	unsigned n_fs;
>  	int res = 0;
>  
> @@ -1068,11 +1067,12 @@ int check_unsafe_exec(struct linux_binpr
>  
>  	n_fs = 1;
>  	write_lock(&p->fs->lock);
> -	lock_task_sighand(p, &flags);
> +	rcu_read_lock();
>  	for (t = next_thread(p); t != p; t = next_thread(t)) {
>  		if (t->fs == p->fs)
>  			n_fs++;
>  	}
> +	rcu_read_unlock();
>  
>  	if (p->fs->users > n_fs) {
>  		bprm->unsafe |= LSM_UNSAFE_SHARE;
> @@ -1080,9 +1080,8 @@ int check_unsafe_exec(struct linux_binpr
>  		if (p->fs->in_exec)
>  			res = -EAGAIN;
>  		p->fs->in_exec = 1;
> +		res = 1;
>  	}
> -
> -	unlock_task_sighand(p, &flags);
>  	write_unlock(&p->fs->lock);
>  
>  	return res;
> @@ -1284,6 +1283,7 @@ int do_execve(char * filename,
>  	struct linux_binprm *bprm;
>  	struct file *file;
>  	struct files_struct *displaced;
> +	bool clear_in_exec;
>  	int retval;
>  
>  	retval = unshare_files(&displaced);
> @@ -1306,8 +1306,9 @@ int do_execve(char * filename,
>  		goto out_unlock;
>  
>  	retval = check_unsafe_exec(bprm);
> -	if (retval)
> +	if (retval < 0)
>  		goto out_unlock;
> +	clear_in_exec = retval;
>  
>  	file = open_exec(filename);
>  	retval = PTR_ERR(file);
> @@ -1355,9 +1356,7 @@ int do_execve(char * filename,
>  		goto out;
>  
>  	/* execve succeeded */
> -	write_lock(&current->fs->lock);
>  	current->fs->in_exec = 0;
> -	write_unlock(&current->fs->lock);
>  	current->in_execve = 0;
>  	mutex_unlock(&current->cred_exec_mutex);
>  	acct_update_integrals(current);
> @@ -1377,9 +1376,8 @@ out_file:
>  	}
>  
>  out_unmark:
> -	write_lock(&current->fs->lock);
> -	current->fs->in_exec = 0;
> -	write_unlock(&current->fs->lock);
> +	if (clear_in_exec)
> +		current->fs->in_exec = 0;
>  
>  out_unlock:
>  	current->in_execve = 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux