On 06/03/2021 02:52, Khazhy Kumykov wrote:
On Fri, Mar 5, 2021 at 7:20 AM John Garry <john.garry@xxxxxxxxxx> wrote:
It has been reported many times that a use-after-free can be intermittently
found when iterating busy requests:
- https://lore.kernel.org/linux-block/8376443a-ec1b-0cef-8244-ed584b96fa96@xxxxxxxxxx/
- https://lore.kernel.org/linux-block/5c3ac5af-ed81-11e4-fee3-f92175f14daf@xxxxxxx/T/#m6c1ac11540522716f645d004e2a5a13c9f218908
- https://lore.kernel.org/linux-block/04e2f9e8-79fa-f1cb-ab23-4a15bf3f64cc@xxxxxxxxx/
The issue is that when we switch the scheduler or change the queue depth,
the driver tagset may still hold references to the stale requests.
As a solution, clean up any references to those requests in the driver
tagset. This is done with a cmpxchg, so that the cleanup is safe against a
race with another queue setting the driver tagset request.
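
For readers following along, the cleanup described above can be sketched in
user space roughly as follows. This is only an illustration under stated
assumptions, not the code from the patch: `tag_rqs`, `NR_TAGS` and
`clear_stale_rqs()` are hypothetical names, and C11
`atomic_compare_exchange_strong()` stands in for the kernel's cmpxchg.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NR_TAGS 4	/* hypothetical tagset depth, for illustration only */

struct request {
	int id;
};

/* Hypothetical stand-in for the driver tagset's tag -> request table. */
static _Atomic(struct request *) tag_rqs[NR_TAGS];

/*
 * Clear every slot that still points into the stale request pool
 * [pool, pool + nr). The compare-and-exchange only writes NULL if the
 * slot still holds the stale pointer we just read, so a slot that
 * another queue has concurrently re-pointed at a live request is left
 * untouched. Returns the number of slots cleared.
 */
static int clear_stale_rqs(struct request *pool, size_t nr)
{
	uintptr_t start = (uintptr_t)pool;
	uintptr_t end = (uintptr_t)(pool + nr);
	int cleared = 0;

	for (size_t tag = 0; tag < NR_TAGS; tag++) {
		struct request *cur = atomic_load(&tag_rqs[tag]);
		uintptr_t addr = (uintptr_t)cur;

		/* Skip empty slots and requests outside the stale pool. */
		if (addr < start || addr >= end)
			continue;

		if (atomic_compare_exchange_strong(&tag_rqs[tag], &cur,
						   (struct request *)NULL))
			cleared++;
	}

	return cleared;
}
```

The point of the compare-and-exchange over a plain store is that a slot
updated by another queue between the load and the write keeps its new value.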
I noticed this crash recently when running blktests on a "debug"
config on a 4.15-based kernel (it would always crash), and backporting
this change fixes it. (Testing on Linus's latest tree also confirmed
the fix, with the same config.) I realize I'm late to the
conversation, but I appreciate the investigation and fixes :)
Good to know. I'll explicitly cc you on further versions.
Thanks,
John