Re: [PATCH 01/23] mm: Introduce revoke_file_mappings.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> writes:

> On Mon,  1 Jun 2009 14:50:26 -0700
> "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx> wrote:
>
>> +static void revoke_vma(struct vm_area_struct *vma)
>
> This looks odd.
>
>> +{
>> +	struct file *file = vma->vm_file;
>> +	struct address_space *mapping = file->f_mapping;
>> +	unsigned long start_addr, end_addr, size;
>> +	struct mm_struct *mm;
>> +
>> +	start_addr = vma->vm_start;
>> +	end_addr = vma->vm_end;
>
> We take a copy of start_addr/end_addr (and this end_addr value is never used)
A foolish consistency.

>> +	/* Switch out the locks so I can maninuplate this under the mm sem.
>> +	 * Needed so I can call vm_ops->close.
>> +	 */
>> +	mm = vma->vm_mm;
>> +	atomic_inc(&mm->mm_users);
>> +	spin_unlock(&mapping->i_mmap_lock);
>> +
>> +	/* Block page faults and other code modifying the mm. */
>> +	down_write(&mm->mmap_sem);
>> +
>> +	/* Lookup a vma for my file address */
>> +	vma = find_vma(mm, start_addr);
>
> Then we look up a vma.  Is there reason to believe that this will
> differ from the incoming arg which we just overwrote?  Maybe the code
> is attempting to handle racing concurrent mmap/munmap activity?  If so,
> what are the implications of this?

Yes it is.  The file based index is only safe while we hold the i_mmap_lock.
The manipulation that needs to happen under the mmap_sem.

So I drop all of the locks and restart.  And use the time honored
kernel practice of relooking up the thing I am going to manipulate.
As long as it is for the same file I don't care.

> I _think_ that what the function is attempting to do is "unmap the vma
> which covers the address at vma->start_addr".  If so, why not just pass
> it that virtual address?

Actually it is  unmapping a vma for the file I am revoking.  I hand it
one and then it does an address space jig.

> Anyway, it's all a bit obscure and I do think that the semantics and
> behaviour should be carefully explained in a comment, no?
>
>> +	if (vma->vm_file != file)
>> +		goto out;
>
> This strengthens the theory that some sort of race-management is
> happening here.

Totally.  I dropped all of my locks so I am having to restart in
a different locking context.

>> +	start_addr = vma->vm_start;
>> +	end_addr   = vma->vm_end;
>> +	size	   = end_addr - start_addr;
>> +
>> +	/* Unlock the pages */
>> +	if (mm->locked_vm && (vma->vm_flags & VM_LOCKED)) {
>> +		mm->locked_vm -= vma_pages(vma);
>> +		vma->vm_flags &= ~VM_LOCKED;
>> +	}
>> +
>> +	/* Unmap the vma */
>> +	zap_page_range(vma, start_addr, size, NULL);
>> +
>> +	/* Unlink the vma from the file */
>> +	unlink_file_vma(vma);
>> +
>> +	/* Close the vma */
>> +	if (vma->vm_ops && vma->vm_ops->close)
>> +		vma->vm_ops->close(vma);
>> +	fput(vma->vm_file);
>> +	vma->vm_file = NULL;
>> +	if (vma->vm_flags & VM_EXECUTABLE)
>> +		removed_exe_file_vma(vma->vm_mm);
>> +
>> +	/* Repurpose the vma  */
>> +	vma->vm_private_data = NULL;
>> +	vma->vm_ops = &revoked_vm_ops;
>> +	vma->vm_flags &= ~(VM_NONLINEAR | VM_CAN_NONLINEAR);
>> +out:
>> +	up_write(&mm->mmap_sem);
>> +	spin_lock(&mapping->i_mmap_lock);
>> +}
>
> Also, I'm not a bit fan of the practice of overwriting the value of a
> formal argument, especially in a function which is this large and
> complex.  It makes the code harder to follow, because the one variable
> holds two conceptually different things within the span of the same
> function.  And it adds risk that someone will will later access a field
> of *vma and it will be the wrong vma.  Worse, the bug is only exposed
> under exeedingly rare conditions.
>
> So..  Use a new local, please.

We can never legitimately have more than one vma manipulated in this function.
As for the rest.  I guess I just assumed that the reader of the code would
have a basic understanding of the locking rules for those data structures.

Certainly the worst thing I suffer from is being close to the code,
and not realizing which pieces are not obvious to a naive observer.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux