On Sun, 17 Aug 2014 10:55:04 +0200 mdraid.pkoch@xxxxxxxx (Peter Koch) wrote:
> Dear Neil,
>
> > That won't help. Data stored in kmalloc-256 won't get swapped out - it
> > stays in RAM. So unless you can hot-plug 20Gig of RAM ....
>
> Thanks for the info. I read it when almost all my memory was in
> kmalloc-256. Half an hour later my machine would have crashed despite
> the increased swapspace. So I did a graceful reboot instead, and the
> reshape has successfully finished in the meantime.
>
> Now I'm going to add those three drives to my array one by one. I'm
> doing this because I cannot physically swap drives 13 and 14 (the next
> maintenance window for such an operation would be in October). I will
> grow the array to 14 drives today, since my main concern is to put the
> data on an even number of disks where the mirrors are separated
> correctly.
>
> Then I will add drives 14 and 15 in one step.
>
> By the way: will a raid10 array with an even number of drives survive
> if one half of the drives goes offline during a reshape operation that
> adds an even number of drives?

Should do, yes.

> Should I download linux 3.14.17 sources and wait for a patch? If only
> a missing kfree() has to be added somewhere I can do that by hand and
> recompile 3.14.16.

The following pair of patches should fix your problems. They should be
easy to apply by hand to whatever kernel you want to use.

> Would it help you if I set up another machine and tried to reproduce
> the problem with linux 3.15.x, 3.16.x and 3.17.x?

No thanks. Memory leaks are quite easy to find - just enable
CONFIG_DEBUG_KMEMLEAK and there they are.... I found about 4, but these
are the only important ones. The second one might not seem so important
from the description, but it is: not freeing that memory causes it to be
re-used in a slightly incorrect way.

Thanks for the report.
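[For reference, the usual kmemleak workflow mentioned above is a few
debugfs writes. This sketch assumes a kernel built with
CONFIG_DEBUG_KMEMLEAK and debugfs available; run as root.]

```shell
# Mount debugfs if it is not already mounted
mount -t debugfs nodev /sys/kernel/debug

# Trigger an immediate scan for unreferenced allocations
echo scan > /sys/kernel/debug/kmemleak

# List suspected leaks, with allocation backtraces
cat /sys/kernel/debug/kmemleak

# Mark everything seen so far as ignored, so only new leaks show up
echo clear > /sys/kernel/debug/kmemleak
```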
NeilBrown

From 83a1ebfa292042b11b1e173b3fc50f243cb01c8b Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@xxxxxxx>
Date: Mon, 18 Aug 2014 13:56:38 +1000
Subject: [PATCH] md/raid10: fix memory leak when reshaping a RAID10.

raid10 reshape clears unwanted bits from a bio->bi_flags using a method
which, while clumsy, worked until 3.10 when BIO_OWNS_VEC was added.
Since then it clears that bit, but it shouldn't. This results in a
memory leak.

So change to use the approved method of clearing unwanted bits.

As this causes a memory leak which can consume all of memory, the fix
is suitable for -stable.

Fixes: a38352e0ac02dbbd4fa464dc22d1352b5fbd06fd
Cc: stable@xxxxxxxxxxxxxxx (v3.10+)
Reported-by: mdraid.pkoch@xxxxxxxx (Peter Koch)
Signed-off-by: NeilBrown <neilb@xxxxxxx>

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index b08c18871323..d9073a10f2f2 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4410,7 +4410,7 @@ read_more:
 	read_bio->bi_private = r10_bio;
 	read_bio->bi_end_io = end_sync_read;
 	read_bio->bi_rw = READ;
-	read_bio->bi_flags &= ~(BIO_POOL_MASK - 1);
+	read_bio->bi_flags &= (~0UL << BIO_RESET_BITS);
 	read_bio->bi_flags |= 1 << BIO_UPTODATE;
 	read_bio->bi_vcnt = 0;
 	read_bio->bi_iter.bi_size = 0;

From afad1968a35676fa39ebe64603ffd7fbf4ceea10 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@xxxxxxx>
Date: Mon, 18 Aug 2014 13:59:50 +1000
Subject: [PATCH] md/raid10: Fix memory leak when raid10 reshape completes.

When a raid10 commences a resync/recovery/reshape it allocates some
buffer space. When a resync/recovery completes the buffer space is
freed, but not when the reshape completes. This can result in a small
memory leak.
Signed-off-by: NeilBrown <neilb@xxxxxxx>

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index d9073a10f2f2..a46124ecafc7 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2953,6 +2953,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr,
 	 */
 	if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)) {
 		end_reshape(conf);
+		close_sync(conf);
 		return 0;
 	}