Re: [PATCH] fix long-standing crash in 32bit migration

Jan Kiszka <jan.kiszka@xxxxxxxxxxx> · Wed, 27 Apr 2011 14:17:42 +0200

On 2011-04-26 18:13, Michael Tokarev wrote:
> This change fixes a long-standing immediate crash (memory corruption
> and abort in glibc malloc code) in migration on 32bits.
> 
> The bug is present since this commit:
> 
>   commit 692d9aca97b865b0f7903565274a52606910f129
>   Author: Bruce Rogers <brogers@xxxxxxxxxx>
>   Date:   Wed Sep 23 16:13:18 2009 -0600
> 
>     qemu-kvm: allocate correct size for dirty bitmap
> 
>     The dirty bitmap copied out to userspace is stored in a long array,
>     and gets copied out to userspace accordingly.  This patch accounts
>     for that correctly.  Currently I'm seeing kvm crashing due to writing
>     beyond the end of the alloc'd dirty bitmap memory, because the buffer
>     has the wrong size.
> 
>     Signed-off-by: Bruce Rogers
>     Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> 
>  --- a/qemu-kvm.c
>  +++ b/qemu-kvm.c
>  @@ int kvm_get_dirty_pages_range(kvm_context_t kvm, unsigned long phys_addr,
>  -            buf = qemu_malloc((slots[i].len / 4096 + 7) / 8 + 2);
>  +            buf = qemu_malloc(BITMAP_SIZE(slots[i].len));
>              r = kvm_get_map(kvm, KVM_GET_DIRTY_LOG, i, buf);
> 
> BITMAP_SIZE is now open-coded in that function, like this:
> 
>  size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS), HOST_LONG_BITS) / 8;
> 
> The problem is that HOST_LONG_BITS in 32bit userspace is 32
> but it's 64 in 64bit kernel.  So userspace aligns this to
> 32, and kernel to 64, but since no length is passed from
> userspace to kernel on ioctl, kernel uses its size calculation
> and copies 4 extra bytes to userspace, corrupting memory.
> 
> Here's how it looks like during migrate execution:
> 
> our=20, kern=24
> our=4, kern=8
> ...
> our=4, kern=8
> our=4064, kern=4064
> our=512, kern=512
> our=4, kern=8
> our=20, kern=24
> our=4, kern=8
> ...
> our=4, kern=8
> our=4064, kern=4064
> *** glibc detected *** ./x86_64-softmmu/qemu-system-x86_64: realloc(): invalid next size: 0x08f20528 ***
> 
> (our is userspace size above, kern is the size as calculated
> by the kernel).
> 
> Fix this by always aligning to 64 in a hope that no platform will
> have sizeof(long)>8 any time soon, and add a comment describing it
> all.  It's a small price to pay for bad kernel design.
> 
> Alternatively it's possible to fix that in the kernel by using
> different size calculation depending on the current process.
> But this becomes quite ugly.
> 
> Special thanks goes to Stefan Hajnoczi for spotting the fundamental
> cause of the issue, and to Alexander Graf for his support in #qemu.
> 
> Signed-off-by: Michael Tokarev <mjt@xxxxxxxxxx>
> CC: Bruce Rogers <brogers@xxxxxxxxxx>
> ---
>  kvm-all.c |   14 +++++++++++++-
>  1 files changed, 13 insertions(+), 1 deletions(-)
> 
> diff --git a/kvm-all.c b/kvm-all.c
> index 7e407f0..3e75e9e 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -382,7 +382,19 @@ static int kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr,
>              break;
>          }
>  
> -        size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS), HOST_LONG_BITS) / 8;
> +	/* XXX bad kernel interface alert
> +         * For dirty bitmap, kernel allocates array of size aligned to
> +         * bits-per-long.  But for case when the kernel is 64bits and
> +         * the userspace is 32bits, userspace can't align to the same
> +         * bits-per-long, since sizeof(long) is different between kernel
> +         * and user space.  This way, userspace will provide buffer which
> +         * may be 4 bytes less than the kernel will use, resulting in
> +         * userspace memory corruption (which is not detectable by valgrind
> +         * too, in most cases).
> +         * So for now, let's align to 64 instead of HOST_LONG_BITS here, in
> +         * a hope that sizeof(long) wont become >8 any time soon.
> +         */
> +        size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS), /*HOST_LONG_BITS*/ 64) / 8;

-ELINETOOLONG
-ETABSALERT

Once fixed, it should also be queued in uq/master.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html