Re: 4.11.2: reshape raid5 -> raid6 atop bcache deadlocks at start on md_attr_store / raid5_make_request

On 23 May 2017, NeilBrown outgrape:

> On Mon, May 22 2017, Nix wrote:
>
>> On 22 May 2017, Wols Lists verbalised:
>>
>> But it's only a few KiB by default! The amount of seeking needed to
>> reshape with such a small intermediate would be fairly horrific. (It was
>> bad enough as it was: the reshape of 7TiB took more than two days,
>> running at under 15MiB/s, though the component drives can all handle
>> 220MiB/s easily. The extra time was spent seeking to and from the
>> backup, it seems.)
>
> If the space before were "only a few KiB", it wouldn't be used.
> You need at least 1 full stripe, typically more.
> Current mdadm leaves several megabytes I think.

I was about to protest and say "oh but it doesn't"... but it helps if
I'm looking at the right machine. It does, but it didn't in 2009 :)

>> spindles will move the data offset such that it is (still) on a chunk or
>> stripe multiple? That's neat, if so, and means I wasted 128MiB on this,
>> uh, 12TiB array. OK I'm not terribly blown away by this, particularly
>> given that I'm wasting the same again inside the bcache partition for
>> the same reason: I'm sure mdadm won't move *that* data offset.)
>
> Data offset is always moved by a multiple of the chunk size.

Right, so after a reshape the overlying fs might no longer be doing
full-stripe writes even when it thinks it is (the offset moves by whole
chunks, not necessarily whole stripes), but its writes will at least
still be chunk-aligned.

> When I create a 12-device raid5 on 1TB devices, then examine one of them,
> it says:
>
>     Data Offset : 262144 sectors
>    Unused Space : before=262064 sectors, after=0 sectors
>      Chunk Size : 512K

(which is 256 chunks, or 21.3 recurring stripes' worth on a 12-device
array.)
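A quick back-of-envelope check of those --examine numbers, using plain
shell arithmetic (512-byte sectors, 512 KiB chunks, as above):

```shell
# Numbers from the mdadm --examine output quoted above.
offset_sectors=262144                # Data Offset, in 512-byte sectors
chunk_kib=512                        # Chunk Size : 512K

offset_kib=$((offset_sectors / 2))   # 262144 sectors = 131072 KiB = 128 MiB
chunks=$((offset_kib / chunk_kib))   # head-space per device, in chunks

echo "${offset_kib} KiB of head-space = ${chunks} chunks per device"
```

256 chunks per device, divided by the 12 devices of the example array,
is where the 21.3-stripes figure comes from.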

> so there is 130Megabytes of space per device, enough for 255 chunks.
> When mdadm moves the Data Offset to allow a reshape to happen without a
> backup file, it aims to use half the available space.  So it would use
> about 60Meg in about 120 chunks or 720Meg total across all devices.
> This is more than the 500MiB backup file you saw.

Right. The message in the manpage saying that backup files are required
for a level change is obsolete, then (and I probably slowed down my last
reshape by specifying one, since seeking to the backup file at the other
end of the disk would have been *much* slower than seeking to that slack
space before the data offset).
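For the record, the backup-file-free path is just the ordinary grow
command: if mdadm finds enough head-space before the data offset, it
relocates the offset itself rather than demanding --backup-file. A
sketch, with invented device names and member counts (this is an
illustration, not a transcript):

```shell
# Hypothetical: convert a 12-disk raid5 to raid6 by adding a 13th
# member. With enough unused space before the data offset, recent
# mdadm shifts the offset and no --backup-file is needed.
mdadm /dev/md0 --add /dev/sdm1
mdadm --grow /dev/md0 --level=6 --raid-devices=13

# Watch reshape progress as usual.
cat /proc/mdstat
```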
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


