On 09/20/2016 02:31 PM, Anthony DeRobertis wrote:
Sorry for the amount of emails I'm sending, but I noticed something
that's probably important. I'm also appending some gdb log from
tracing through the function (trying to answer why it's doing cluster
mode stuff at all).
While tracing through, I noticed that *before* the write-bitmap loop,
mdadm -E considers the superblock valid. That agrees with what I saw
from strace, I suppose. At first glance, it figures out how much to
write by calling this function:
static unsigned int calc_bitmap_size(bitmap_super_t *bms, unsigned int boundary)
{
        unsigned long long bits, bytes;

        bits = __le64_to_cpu(bms->sync_size) /
                (__le32_to_cpu(bms->chunksize)>>9);
        bytes = (bits+7) >> 3;
        bytes += sizeof(bitmap_super_t);
        bytes = ROUND_UP(bytes, boundary);

        return bytes;
}
That code looked familiar, and I figured out where—it's also in
95a05b37e8eb2bc0803b1a0298fce6adc60eff16, the commit that I found
originally broke it. But that commit is making a change to it: it
changed the ROUND_UP line from 512 to 4096 (and from the gdb trace,
boundary==4096).
I tested changing that line to "bytes = ROUND_UP(bytes, 512);", and it
works. Adds the new disk to the array and produces no warnings or errors.
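
To make the difference between the two boundaries concrete, here is a
rough standalone sketch of the same arithmetic. It is not the mdadm code:
the names sketch_bitmap_size, SKETCH_ROUND_UP and SKETCH_SB_BYTES are made
up for illustration, the 256-byte header size is my assumption about
sizeof(bitmap_super_t), and the endian conversions are left out.

```c
#include <assert.h>

/* Assumed size of the on-disk bitmap superblock header, in bytes.
 * Hypothetical stand-in for sizeof(bitmap_super_t). */
#define SKETCH_SB_BYTES 256ULL

/* Round val up to the next multiple of base (illustrative macro,
 * not the ROUND_UP definition from mdadm). */
#define SKETCH_ROUND_UP(val, base) ((((val) + (base) - 1) / (base)) * (base))

/* sync_size is in 512-byte sectors, chunksize is in bytes, and
 * boundary is the rounding granularity in bytes. */
static unsigned long long sketch_bitmap_size(unsigned long long sync_size,
                                             unsigned int chunksize,
                                             unsigned int boundary)
{
        unsigned long long chunk_sectors = chunksize >> 9;
        unsigned long long bits, bytes;

        /* Guard against a division by zero on a bogus chunk size. */
        assert(chunk_sectors > 0);
        assert(boundary > 0);

        bits = sync_size / chunk_sectors;   /* one bit per bitmap chunk */
        bytes = (bits + 7) >> 3;            /* bits to bytes, rounded up */
        bytes += SKETCH_SB_BYTES;           /* header precedes the bits */
        return SKETCH_ROUND_UP(bytes, (unsigned long long)boundary);
}
```

With a 1 GiB device (sync_size 2097152 sectors) and a 64 MiB chunk, that
is 16 bits, so 2 bytes plus the header. Under the 256-byte assumption the
result rounds to 512 with boundary 512 and to 4096 with boundary 4096, so
the larger boundary makes the write cover eight sectors instead of one.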
I think it is a coincidence that the above change works; commit 4a3d29e
made the change, but it didn't change the logic at all. It also seems the
problem is not related to the md-cluster code, since your gdb trace shows
it runs into the part below because the version is 4.
/* no need to change bms->nodes for other bitmap types */
Thanks,
Guoqing