On Fri, Sep 9, 2022 at 10:56 PM Vincent Fu <vincent.fu@xxxxxxxxxxx> wrote:
> You could test your theory about max_retries by creating an NVMe fabrics
> loopback device backed by null_blk with error injection. Then try to access one
> of the bad blocks via the nvme device and see if the delay before fio sees
> the error depends on io_timeout and max_retries in the way that you expect.

Oooh, that sounds great. Thanks for the suggestion. I'll get to it Monday
if I don't find some time this weekend.

Coincidentally, one of the things I found while googling was someone using
NVMe fabrics complaining that nvme_core/io_timeout and nvme_core/max_retries
were not being honored. It was from 2019 but seemed relevant:
https://lore.kernel.org/all/EA2BFA4D4BAD49629F533A98F74DCE42@alyakaslap/T/#m26b5c91ec59de5159961a26a6cb0340c32a05ec9

I'll report back with what I see.

Thanks,
Nick