Hi,
On 2024/08/01 13:01, Greg Kroah-Hartman wrote:
On Wed, Jul 31, 2024 at 09:43:58PM +0200, Mateusz Jończyk wrote:
On 30.07.2024 at 17:49, Greg Kroah-Hartman wrote:
6.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mateusz Jończyk <mat.jonczyk@xxxxx>
commit 36a5c03f232719eb4e2d925f4d584e09cfaf372c upstream.
Linux 6.9+ is unable to start a degraded RAID1 array with one drive,
when that drive has a write-mostly flag set. During such an attempt,
the following assertion in bio_split() is hit:
BUG_ON(sectors <= 0);
Call Trace:
? bio_split+0x96/0xb0
? exc_invalid_op+0x53/0x70
? bio_split+0x96/0xb0
? asm_exc_invalid_op+0x1b/0x20
? bio_split+0x96/0xb0
? raid1_read_request+0x890/0xd20
? __call_rcu_common.constprop.0+0x97/0x260
raid1_make_request+0x81/0xce0
? __get_random_u32_below+0x17/0x70
? new_slab+0x2b3/0x580
md_handle_request+0x77/0x210
md_submit_bio+0x62/0xa0
__submit_bio+0x17b/0x230
submit_bio_noacct_nocheck+0x18e/0x3c0
submit_bio_noacct+0x244/0x670
After investigation, it turned out that choose_slow_rdev() does not set
the value of max_sectors in some cases and because of it,
raid1_read_request calls bio_split with sectors == 0.
Fix it by filling in this variable.
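The failure pattern described above can be sketched in plain C. This is only an illustration under stated assumptions: toy_rdev, pick_slow_buggy() and pick_slow_fixed() are hypothetical names, the structure is a simplified stand-in, and none of this is the kernel code or the actual patch.

```c
/* Hypothetical, simplified stand-in for a RAID1 component device. */
struct toy_rdev {
	int usable;        /* 1 if the device can serve the read */
	int good_sectors;  /* sectors readable from the requested offset */
};

/*
 * Buggy pattern: a device index is returned, but the out-parameter
 * *max_sectors is never written, so the caller keeps whatever value
 * it had before the call (0 in the scenario from the report).
 */
static int pick_slow_buggy(const struct toy_rdev *devs, int n, int *max_sectors)
{
	(void)max_sectors;	/* deliberately left untouched */
	for (int i = 0; i < n; i++)
		if (devs[i].usable)
			return i;
	return -1;
}

/*
 * Fixed pattern: whenever a device is returned, *max_sectors is filled
 * in as well, so the caller never splits a request at 0 sectors.
 */
static int pick_slow_fixed(const struct toy_rdev *devs, int n, int *max_sectors)
{
	for (int i = 0; i < n; i++) {
		if (devs[i].usable) {
			*max_sectors = devs[i].good_sectors;
			return i;
		}
	}
	return -1;
}
```

With a single usable device and a caller that starts from max_sectors == 0, the first helper hands back a valid index alongside a sector count of 0, which is the kind of value that trips BUG_ON(sectors <= 0); the second helper hands back the device's sector count.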
This bug was introduced in
commit dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()")
but apparently hidden until
commit 0091c5a269ec ("md/raid1: factor out helpers to choose the best rdev from read_balance()")
shortly thereafter.
Cc: stable@xxxxxxxxxxxxxxx # 6.9.x+
Signed-off-by: Mateusz Jończyk <mat.jonczyk@xxxxx>
Fixes: dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()")
Cc: Song Liu <song@xxxxxxxxxx>
Cc: Yu Kuai <yukuai3@xxxxxxxxxx>
Cc: Paul Luse <paul.e.luse@xxxxxxxxxxxxxxx>
Cc: Xiao Ni <xni@xxxxxxxxxx>
Cc: Mariusz Tkaczyk <mariusz.tkaczyk@xxxxxxxxxxxxxxx>
Link: https://lore.kernel.org/linux-raid/20240706143038.7253-1-mat.jonczyk@xxxxx/
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Hello,
FYI there is a second regression in Linux 6.9 - 6.11, which occurs with RAID
component devices with a write-mostly flag when a new device is added
to the array. (A write-mostly flag on a device specifies that the kernel is to
avoid reading from such a device, if possible. It is enabled only manually with
an mdadm command line switch and can be beneficial when devices are of
different speeds.) The kernel then reads from the wrong component device
before it is synced, which may result in data corruption.
Link: https://lore.kernel.org/lkml/9952f532-2554-44bf-b906-4880b2e88e3a@xxxxx/T/
The second regression is not caused by this patch; the two are linked only
because similar functions and the write-mostly flag are involved in both cases.
The issue is that without this patch, the kernel will fail to start or keep
running a RAID array with a single write-mostly device, so the user will not be
able to add another device to it, which is what triggers the second regression.
Paul was of the opinion that this first patch should land nonetheless.
I would like you to decide whether to ship it now or defer it.
Is there a fix for this anywhere? If not, being in sync with Linus's
tree is probably the best solution for now.
The second regression is not related to this patch; a separate fix
should be applied to mainline and then backported to stable. Hence this
lts patch should be merged.
Thanks,
Kuai
thanks,
greg k-h