On Sun, 17 Aug 2014 10:55:04 +0200 mdraid.pkoch@xxxxxxxx (Peter Koch) wrote:
> Dear Neil,
>
> > That won't help. Data stored in kmalloc-256 won't get swapped out - it
> > stays in RAM. So unless you can hot-plug 20Gig of RAM ....
>
> Thanks for the info. I read it when almost all my memory was in
> kmalloc-256. Half an hour later my machine would have crashed despite
> the increased swapspace. So I did a graceful reboot instead, and the
> reshape has successfully finished in the meantime.
>
> Now I'm going to add those three drives to my array one by one. I'm
> doing this because I cannot physically swap drives 13 and 14 (the next
> maintenance window for such an operation would be in October). I will
> grow the array to 14 drives today, since my main concern is to put the
> data on an even number of disks where the mirrors are separated
> correctly.
>
> Then I will add drives 14 and 15 in one step.
>
> By the way: will a raid10 array with an even number of drives survive
> if one half of the drives goes offline during a reshape operation that
> adds an even number of drives?

Should do, yes.

> Should I download linux 3.14.17 sources and wait for a patch? If only
> a missing kfree() has to be added somewhere I can do that by hand and
> recompile 3.14.16.

The following pair of patches should fix your problems. They should be
easy to apply by hand to whatever kernel you want to use.

> Would it help you if I set up another machine and tried to reproduce
> the problem with linux 3.15.x, 3.16.x and 3.17.x?

No thanks. Memory leaks are quite easy to find - just enable
CONFIG_DEBUG_KMEMLEAK and there they are.... I found about 4, but these
are the only important ones. The second one might not seem so important
from the description, but it is: not freeing that memory causes it to be
re-used in a slightly incorrect way.

Thanks for the report.
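[For reference, the usual kmemleak workflow mentioned above is a few
debugfs writes. This sketch assumes a kernel built with
CONFIG_DEBUG_KMEMLEAK and debugfs available; run as root.]

```shell
# Mount debugfs if it is not already mounted
mount -t debugfs nodev /sys/kernel/debug

# Trigger an immediate scan for unreferenced allocations
echo scan > /sys/kernel/debug/kmemleak

# List suspected leaks, with allocation backtraces
cat /sys/kernel/debug/kmemleak

# Mark everything seen so far as ignored, so only new leaks show up
echo clear > /sys/kernel/debug/kmemleak
```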
NeilBrown

From 83a1ebfa292042b11b1e173b3fc50f243cb01c8b Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@xxxxxxx>
Date: Mon, 18 Aug 2014 13:56:38 +1000
Subject: [PATCH] md/raid10: fix memory leak when reshaping a RAID10.

raid10 reshape clears unwanted bits from a bio->bi_flags using a method
which, while clumsy, worked until 3.10 when BIO_OWNS_VEC was added.
Since then it clears that bit, but it shouldn't. This results in a
memory leak.

So change to use the approved method of clearing unwanted bits.

As this causes a memory leak which can consume all of memory, the fix
is suitable for -stable.

Fixes: a38352e0ac02dbbd4fa464dc22d1352b5fbd06fd
Cc: stable@xxxxxxxxxxxxxxx (v3.10+)
Reported-by: mdraid.pkoch@xxxxxxxx (Peter Koch)
Signed-off-by: NeilBrown <neilb@xxxxxxx>

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index b08c18871323..d9073a10f2f2 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4410,7 +4410,7 @@ read_more:
 	read_bio->bi_private = r10_bio;
 	read_bio->bi_end_io = end_sync_read;
 	read_bio->bi_rw = READ;
-	read_bio->bi_flags &= ~(BIO_POOL_MASK - 1);
+	read_bio->bi_flags &= (~0UL << BIO_RESET_BITS);
 	read_bio->bi_flags |= 1 << BIO_UPTODATE;
 	read_bio->bi_vcnt = 0;
 	read_bio->bi_iter.bi_size = 0;

From afad1968a35676fa39ebe64603ffd7fbf4ceea10 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@xxxxxxx>
Date: Mon, 18 Aug 2014 13:59:50 +1000
Subject: [PATCH] md/raid10: Fix memory leak when raid10 reshape completes.

When a raid10 commences a resync/recovery/reshape it allocates some
buffer space. When a resync/recovery completes the buffer space is
freed, but not when the reshape completes. This can result in a small
memory leak.
Signed-off-by: NeilBrown <neilb@xxxxxxx>

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index d9073a10f2f2..a46124ecafc7 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2953,6 +2953,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr,
 	 */
 	if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)) {
 		end_reshape(conf);
+		close_sync(conf);
 		return 0;
 	}