On Wed, 18 Feb 2004, Paul Starzetz wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Synopsis:  Linux kernel do_mremap VMA limit local privilege
>            escalation vulnerability
> Product:   Linux kernel
> Version:   2.2 up to 2.2.25, 2.4 up to 2.4.24, 2.6 up to 2.6.2
> Vendor:    http://www.kernel.org/
> URL:       http://isec.pl/vulnerabilities/isec-0014-mremap-unmap.txt
> CVE:       CAN-2004-0077
> Author:    Paul Starzetz <ihaquer@isec.pl>
> Date:      February 18, 2004
>
>
> Issue:
> ======
>
> A critical security vulnerability has been found in the Linux kernel
> memory management code, inside the mremap(2) system call, due to a
> missing function return value check. This bug is completely unrelated
> to the mremap bug disclosed on 05-01-2004, apart from concerning the
> same internal kernel function code.
>
>
> Details:
> ========
>
> The Linux kernel manages a list of user-addressable valid memory
> locations on a per-process basis. Every process owns a singly linked
> list of so-called virtual memory area descriptors (referred to below
> simply as VMAs). Every VMA describes the start of a valid memory
> region, its length, and various memory flags such as page protection.
>
> Every VMA in the list corresponds to a part of the process's page
> table. The page table contains descriptors (in short, page table
> entries, or PTEs) of the physical memory pages seen by the process.
> A VMA descriptor can thus be understood as a high-level description
> of a particular region of the process's page table, storing PTE
> properties like the page R/W flag and so on.
>
> The mremap() system call provides resizing (shrinking or growing) as
> well as moving of existing virtual memory areas, or any parts of
> them, across the process's addressable space.
>
> Moving a part of the virtual memory from inside a VMA to a new
> location requires creating a new VMA descriptor as well as copying
> the underlying page table entries described by the VMA from the old
> to the new location in the process's page table.
>
> To accomplish this task the do_mremap code calls the do_munmap()
> internal kernel function to remove any potentially existing old
> memory mapping at the new location, as well as to remove the old
> virtual memory mapping. Unfortunately the code doesn't test the
> return value of the do_munmap() function, which may fail if the
> maximum number of available VMA descriptors has been exceeded. This
> happens if one tries to unmap the middle part of an existing memory
> mapping when the process's limit on the number of VMAs has already
> been reached (currently 65535).
>
> One of the possible situations can be illustrated with the following
> picture. The corresponding page table entries (PTEs) have been marked
> with o and x:
>
> Before mremap():
>
> (oooooooooooooooooooooooo)   (xxxxxxxxxxxx)
> [----------VMA1----------]   [----VMA2----]
>       [REMAPPED-VMA] <---------------|
>
>
> After mremap() without the VMA limit:
>
> (oooo)(xxxxxxxxxxxx)(oooo)
> [VMA3][REMAPPED-VMA][VMA4]
>
>
> After mremap() with the VMA limit reached:
>
> (ooooxxxxxxxxxxxxxxoooo)
> [---------VMA1---------]
>     [REMAPPED-VMA]
>
>
> After the maximum number of VMAs in the process's VMA list has been
> reached, do_munmap() will refuse to create the necessary VMA hole,
> because doing so would split the original VMA into two disjoint VMA
> areas, exceeding the VMA descriptor limit.
> Due to the missing return value check after trying to unmap the
> middle of VMA1 (this is the first invocation of do_munmap inside the
> do_mremap code), the corresponding page table entries from VMA2 are
> still inserted into the page table location described by VMA1, thus
> becoming subject to VMA1's page protection flags. It must also be
> mentioned that the original PTEs in VMA1 are lost, leaving the
> corresponding page frames unusable forever.
>
> The kernel also tries to insert the overlapping VMA area into the
> VMA descriptor list, but this fails due to further checks in the
> low-level VMA manipulation code. The low-level VMA list check in the
> 2.4 and 2.6 kernel versions just calls BUG(), terminating the
> malicious process.
>
> There are also two other unchecked calls to do_munmap() inside the
> do_mremap() code, and we believe that the second occurrence of the
> unchecked do_munmap is also exploitable. The second occurrence takes
> place if the VMA to be remapped is being truncated in place. Note
> that do_munmap can also fail under an exceptional low-memory
> condition while trying to allocate a VMA descriptor.
>
> We were able to create robust proof-of-concept exploit code giving
> full super-user privileges on all vulnerable kernel versions. The
> exploit code will be released next week.
>
>
> Impact:
> =======
>
> Since no special privileges are required to use the mremap(2) system
> call, any process may use its unexpected behavior to disrupt the
> kernel memory management subsystem.
>
> Proper exploitation of this vulnerability leads to local privilege
> escalation, giving an attacker full super-user privileges. The
> vulnerability may also lead to a denial-of-service attack on the
> available system memory.
>
> Tested and known to be vulnerable kernel versions are all <= 2.2.25,
> <= 2.4.24 and <= 2.6.1. The 2.2.25 version of the Linux kernel does
> not recognize the MREMAP_FIXED flag, but this does not prevent the
> bug from being successfully exploited. All users are encouraged to
> patch all vulnerable systems as soon as appropriate vendor patches
> are released. There is no hotfix for this vulnerability: even
> limited per-user virtual memory still permits do_munmap() to fail.
>
>
> Credits:
> ========
>
> Paul Starzetz <ihaquer@isec.pl> has identified the vulnerability and
> performed further research.
>
> COPYING, DISTRIBUTION, AND MODIFICATION OF INFORMATION PRESENTED
> HERE IS ALLOWED ONLY WITH EXPRESS PERMISSION OF ONE OF THE AUTHORS.
>
>
> Disclaimer:
> ===========
>
> This document and all the information it contains are provided "as
> is", for educational purposes only, without warranty of any kind,
> whether express or implied.
>
> The authors reserve the right not to be responsible for the
> topicality, correctness, completeness or quality of the information
> provided in this document. Liability claims regarding damage caused
> by the use of any information provided, including any kind of
> information which is incomplete or incorrect, will therefore be
> rejected.
>
> - --
> Paul Starzetz
> iSEC Security Research
> http://isec.pl/
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.0.7 (GNU/Linux)
>
> iD8DBQFAM1QzC+8U3Z5wpu4RAqXzAKCMOkFu1mXzzRgLyuFYp4ORpQCQDgCfe4M2
> 3IjbGvzniOjv/Hc7KKAzMtU=
> =GJds
> -----END PGP SIGNATURE-----

The attached patch fixes this bug for kernel 2.2.25. It should also
apply cleanly to kernels since at least 2.2.21.

--
Sincerely Yours,
Dan.
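[Editorial note] The VMA descriptor limit the advisory relies on can be
observed from plain userspace. The program below is a minimal sketch of
the standard fragmentation technique (an illustration written for this
post, not iSEC's exploit): alternating page protections prevent adjacent
areas from being merged, so each mprotect() call splits a VMA until the
per-process limit (about 65535 descriptors; the exact value varies by
kernel) is reached and further splits fail with ENOMEM. At that point
any operation that must split a VMA, including the hole-punching
do_munmap() described above, starts failing.

/*
 * Sketch only: drive a process toward its VMA descriptor limit by
 * splitting one large anonymous mapping with alternating protections.
 * Assumes nothing beyond standard mmap/mprotect behavior.
 */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        long psz = sysconf(_SC_PAGESIZE);
        size_t npages = 0x20000;        /* more pages than the VMA limit */
        char *base;
        size_t i;

        base = mmap(NULL, npages * psz, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
                return 1;

        /*
         * Revoking write access on every second page keeps neighbouring
         * areas unmergeable, so each successful call adds descriptors
         * to the process's VMA list.
         */
        for (i = 0; i < npages; i += 2)
                if (mprotect(base + i * psz, psz, PROT_READ) != 0)
                        break;  /* typically ENOMEM: the limit is reached */

        printf("VMA list saturated after %zu splits\n", i / 2);
        return 0;
}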
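[Editorial note] Before reading the patch itself, it may help to see the
shape of the flaw in isolation. The following is a paraphrased sketch of
the tail of the move path, not verbatim 2.2 source; the identifiers
match those appearing in the diff below.

/* Paraphrased sketch, not verbatim kernel source. */

/*
 * Vulnerable shape: the unmap result is silently dropped, so when
 * do_munmap() fails the new and old mappings are left overlapping.
 */
insert_vm_struct(current->mm, new_vma);
merge_segments(current->mm, new_vma->vm_start, new_vma->vm_end);
do_munmap(addr, old_len);               /* may fail at the VMA limit */

/*
 * Patched shape: unmap first, and on failure tear the half-built new
 * mapping down and propagate the error to the caller.
 */
if ((ret = do_munmap(addr, old_len))) {
        /* ... release new_vma and zap the copied PTEs ... */
        return ret;
}
insert_vm_struct(current->mm, new_vma);
merge_segments(current->mm, new_vma->vm_start, new_vma->vm_end);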
--- linux/mm/mremap.c.security	Sun Mar 25 20:31:03 2001
+++ linux/mm/mremap.c	Thu Feb 19 05:10:34 2004
@@ -9,6 +9,7 @@
 #include <linux/shm.h>
 #include <linux/mman.h>
 #include <linux/swap.h>
+#include <linux/file.h>
 
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
@@ -25,7 +26,7 @@
 	if (pgd_none(*pgd))
 		goto end;
 	if (pgd_bad(*pgd)) {
-		printk("move_one_page: bad source pgd (%08lx)\n", pgd_val(*pgd));
+		printk("copy_one_page: bad source pgd (%08lx)\n", pgd_val(*pgd));
 		pgd_clear(pgd);
 		goto end;
 	}
@@ -34,7 +35,7 @@
 	if (pmd_none(*pmd))
 		goto end;
 	if (pmd_bad(*pmd)) {
-		printk("move_one_page: bad source pmd (%08lx)\n", pmd_val(*pmd));
+		printk("copy_one_page: bad source pmd (%08lx)\n", pmd_val(*pmd));
 		pmd_clear(pmd);
 		goto end;
 	}
@@ -57,34 +58,22 @@
 	return pte;
 }
 
-static inline int copy_one_pte(pte_t * src, pte_t * dst)
+static int copy_one_page(struct mm_struct *mm, unsigned long old_addr, unsigned long new_addr)
 {
-	int error = 0;
-	pte_t pte = *src;
+	pte_t * src, * dst;
 
-	if (!pte_none(pte)) {
-		error++;
-		if (dst) {
-			pte_clear(src);
-			set_pte(dst, pte);
-			error--;
+	src = get_one_pte(mm, old_addr);
+	if (src && !pte_none(*src)) {
+		if ((dst = alloc_one_pte(mm, new_addr))) {
+			set_pte(dst, *src);
+			return 0;
 		}
+		return 1;
 	}
-	return error;
-}
-
-static int move_one_page(struct mm_struct *mm, unsigned long old_addr, unsigned long new_addr)
-{
-	int error = 0;
-	pte_t * src;
-
-	src = get_one_pte(mm, old_addr);
-	if (src)
-		error = copy_one_pte(src, alloc_one_pte(mm, new_addr));
-	return error;
+	return 0;
 }
 
-static int move_page_tables(struct mm_struct * mm,
+static int copy_page_tables(struct mm_struct * mm,
 	unsigned long new_addr, unsigned long old_addr, unsigned long len)
 {
 	unsigned long offset = len;
@@ -99,7 +88,7 @@
 	 */
 	while (offset) {
 		offset -= PAGE_SIZE;
-		if (move_one_page(mm, old_addr + offset, new_addr + offset))
+		if (copy_one_page(mm, old_addr + offset, new_addr + offset))
 			goto oops_we_failed;
 	}
 	return 0;
@@ -113,8 +102,6 @@
 	 */
 oops_we_failed:
 	flush_cache_range(mm, new_addr, new_addr + len);
-	while ((offset += PAGE_SIZE) < len)
-		move_one_page(mm, new_addr + offset, old_addr + offset);
 	zap_page_range(mm, new_addr, len);
 	flush_tlb_range(mm, new_addr, new_addr + len);
 	return -1;
@@ -129,7 +116,9 @@
 	if (new_vma) {
 		unsigned long new_addr = get_unmapped_area(addr, new_len);
 
-		if (new_addr && !move_page_tables(current->mm, new_addr, addr, old_len)) {
+		if (new_addr && !copy_page_tables(current->mm, new_addr, addr, old_len)) {
+			unsigned long ret;
+
 			*new_vma = *vma;
 			new_vma->vm_start = new_addr;
 			new_vma->vm_end = new_addr+new_len;
@@ -138,9 +127,19 @@
 				new_vma->vm_file->f_count++;
 			if (new_vma->vm_ops && new_vma->vm_ops->open)
 				new_vma->vm_ops->open(new_vma);
+			if ((ret = do_munmap(addr, old_len))) {
+				if (new_vma->vm_ops && new_vma->vm_ops->close)
+					new_vma->vm_ops->close(new_vma);
+				if (new_vma->vm_file)
+					fput(new_vma->vm_file);
+				flush_cache_range(current->mm, new_addr, new_addr + old_len);
+				zap_page_range(current->mm, new_addr, old_len);
+				flush_tlb_range(current->mm, new_addr, new_addr + old_len);
+				kmem_cache_free(vm_area_cachep, new_vma);
+				return ret;
+			}
 			insert_vm_struct(current->mm, new_vma);
 			merge_segments(current->mm, new_vma->vm_start, new_vma->vm_end);
-			do_munmap(addr, old_len);
 			current->mm->total_vm += new_len >> PAGE_SHIFT;
 			if (new_vma->vm_flags & VM_LOCKED) {
 				current->mm->locked_vm += new_len >> PAGE_SHIFT;
@@ -176,9 +175,9 @@
 	 * Always allow a shrinking remap: that just unmaps
 	 * the unnecessary pages..
 	 */
-	ret = addr;
 	if (old_len >= new_len) {
-		do_munmap(addr+new_len, old_len - new_len);
+		if (!(ret = do_munmap(addr+new_len, old_len - new_len)))
+			ret = addr;
 		goto out;
 	}
 
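[Editorial note] The last hunk is easy to underestimate: a shrinking
mremap() over a range that lies strictly inside a larger VMA also
punches a hole, so its do_munmap() can fail at the VMA descriptor limit
too (the "truncated in place" case the advisory mentions). With the
patch applied, that failure reaches userspace instead of being
swallowed. A hypothetical post-patch check follows (a sketch written
for this post; on an unloaded system the call simply succeeds, and the
message only appears when the process is already at its VMA limit):

#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        long psz = sysconf(_SC_PAGESIZE);
        char *base = mmap(NULL, 16 * psz, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
                return 1;

        /*
         * Shrinking the 4-page range in the middle of the 16-page
         * mapping unmaps [base+5p, base+8p), which splits the VMA.
         * On a patched kernel, if that split cannot be performed
         * (VMA limit, low memory), mremap() returns MAP_FAILED with
         * errno set instead of reporting success over a corrupted
         * address space.
         */
        if (mremap(base + 4 * psz, 4 * psz, psz, 0) == MAP_FAILED)
                printf("mremap reported the failed unmap: %s\n",
                       strerror(errno));
        return 0;
}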