On Tue, Jul 2, 2024 at 11:18 PM Benjamin Marzinski <bmarzins@xxxxxxxxxx> wrote:
>
> When handling an IO request, MD checks if a reshape is currently
> happening, and if so, where the IO sector is in relation to the reshape
> progress. MD uses conf->reshape_progress for both of these tasks. When
> the reshape finishes, conf->reshape_progress is set to MaxSector. If
> this occurs after MD checks if the reshape is currently happening but
> before it calls ahead_of_reshape(), then ahead_of_reshape() will end up
> comparing the IO sector against MaxSector. During a backwards reshape,
> this will make MD think the IO sector is in the area not yet reshaped,
> causing it to use the previous configuration, and map the IO to the
> sector where that data was before the reshape.
>
> This bug can be triggered by running the lvm2
> lvconvert-raid-reshape-linear_to_raid6-single-type.sh test in a loop,
> although it's very hard to reproduce.
>
> Fix this by factoring the code that checks where the IO sector is in
> relation to the reshape out to a helper called get_reshape_loc(),
> which reads reshape_progress and reshape_safe while holding the
> device_lock, and then rechecks if the reshape has finished before
> calling ahead_of_reshape with the saved values.
>
> Also use the helper during the REQ_NOWAIT check to see if the location
> is inside of the reshape region.
>
> Fixes: fef9c61fdfabf ("md/raid5: change reshape-progress measurement to cope with reshaping backwards.")
> Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx>

Applied to md-6.11. Thanks!

Song
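
For anyone reading this thread in the archive: the approach in the quoted commit message can be modeled in user space roughly as below. This is only a sketch, not the code from the patch. The names get_reshape_loc(), ahead_of_reshape(), reshape_progress, reshape_safe, device_lock and MaxSector come from the description; struct model_conf, the LOC_* enum values, the pthread mutex standing in for the kernel spinlock, and reshape_loc_example() are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;
#define MaxSector (~(sector_t)0)

/* Hypothetical result names; the description only names the helper. */
enum reshape_loc {
	LOC_NO_RESHAPE,
	LOC_AHEAD_OF_RESHAPE,
	LOC_INSIDE_RESHAPE,
	LOC_BEHIND_RESHAPE,
};

/* Hypothetical user-space stand-in for the MD per-array state. */
struct model_conf {
	pthread_mutex_t device_lock;  /* models conf->device_lock */
	sector_t reshape_progress;    /* MaxSector once the reshape is done */
	sector_t reshape_safe;
	bool reshape_backwards;
};

/* True if the sector lies in the area the reshape has not reached yet. */
static bool ahead_of_reshape(bool backwards, sector_t sector,
			     sector_t reshape_sector)
{
	return backwards ? sector < reshape_sector : sector >= reshape_sector;
}

enum reshape_loc get_reshape_loc(struct model_conf *conf, sector_t sector)
{
	sector_t progress, safe;

	/* Snapshot both values under the lock so they are read together. */
	pthread_mutex_lock(&conf->device_lock);
	progress = conf->reshape_progress;
	safe = conf->reshape_safe;
	pthread_mutex_unlock(&conf->device_lock);

	/* Recheck on the snapshot: never compare a sector against MaxSector. */
	if (progress == MaxSector)
		return LOC_NO_RESHAPE;
	if (ahead_of_reshape(conf->reshape_backwards, sector, progress))
		return LOC_AHEAD_OF_RESHAPE;
	if (ahead_of_reshape(conf->reshape_backwards, sector, safe))
		return LOC_INSIDE_RESHAPE;
	return LOC_BEHIND_RESHAPE;
}

/* Usage sketch: a backwards reshape that finishes between two requests. */
void reshape_loc_example(void)
{
	struct model_conf conf = {
		.device_lock = PTHREAD_MUTEX_INITIALIZER,
		.reshape_progress = 1000,
		.reshape_safe = 2000,
		.reshape_backwards = true,
	};

	assert(get_reshape_loc(&conf, 500) == LOC_AHEAD_OF_RESHAPE);

	/* The unguarded comparison from the bug report: with progress at
	 * MaxSector, every sector looks "not yet reshaped". */
	assert(ahead_of_reshape(true, 500, MaxSector));

	/* Reshape finishes; the helper reports that instead. */
	pthread_mutex_lock(&conf.device_lock);
	conf.reshape_progress = MaxSector;
	pthread_mutex_unlock(&conf.device_lock);
	assert(get_reshape_loc(&conf, 500) == LOC_NO_RESHAPE);
}
```

The point of the snapshot is that the "is a reshape running" decision and the position comparison both use the same saved value, so a reshape that completes in between cannot turn the comparison into one against MaxSector.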