On 7/17/23 23:47, Damien Le Moal wrote:
On 7/11/23 03:01, Bart Van Assche wrote:
Send commands that failed with an unaligned write error to the SCSI error
handler. Let the SCSI error handler sort SCSI commands per LBA before
resubmitting these.
Increase the number of retries for write commands sent to a sequential
zone to the maximum number of outstanding commands.
I think I mentioned this before. When we started btrfs work, we did something
similar (but at the IO scheduler level) to try to avoid adding a big lock in
btrfs to serialize (and thus order) writes. What we discovered is that it was
extremely easy to fall into a situation where the maximum number of possible
outstanding requests is already issued, but they all are behind a "hole" and
indefinitely delayed because the missing request cannot be issued due to the max
nr request limit being reached. No forward progress and deadlock.
I do not see how your change addresses this problem. The same will happen with
this and I do not have any suggestion how to solve this. For btrfs, we ended up
using zone append emulation for scsi to avoid the big lock and to keep the FS from
having to order writes. That solution guarantees forward progress. Delaying
already issued writes that are not sequential has no such guarantees.
Hi Damien,
Thank you for having explained in detail the scenario that you ran into.
I think what has been explained above is a scenario in which the filesystem
allocates requests per zone in an order other than the LBA order. How about
requiring that the filesystem allocate and submit zoned writes in LBA order
per zone? I think that this is how F2FS supports zoned storage.
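
To make the two cases concrete, here is a toy model of a single sequential
zone. It is only a sketch under stated assumptions, not kernel code: the name
simulate() and its parameters are hypothetical, every write covers one block,
the device only accepts the write that matches the write pointer, and a
delayed write keeps holding its request slot until it can complete.

```python
from collections import deque


def simulate(submit_order, max_outstanding):
    """Toy model of writes to one sequential zone (hypothetical helper).

    submit_order: LBAs in the order requests are allocated; assumed to be
                  a permutation of range(len(submit_order)).
    max_outstanding: limit on the number of requests in flight.
    Returns (blocks_written, deadlocked).
    """
    pending = deque(submit_order)
    in_flight = set()
    write_pointer = 0
    total = len(submit_order)

    while write_pointer < total:
        # Allocate requests in submission order until the limit is hit.
        while pending and len(in_flight) < max_outstanding:
            in_flight.add(pending.popleft())

        if write_pointer in in_flight:
            # The write at the write pointer can complete and free a slot.
            in_flight.remove(write_pointer)
            write_pointer += 1
        else:
            # All slots are held by writes behind the hole, and the write
            # that would fill the hole cannot get a slot: no progress.
            return write_pointer, True

    return write_pointer, False


# LBA 0 is allocated last and only 3 requests may be outstanding.
print(simulate([1, 2, 3, 0], max_outstanding=3))
# Same writes, allocated and submitted in LBA order.
print(simulate([0, 1, 2, 3], max_outstanding=3))
```

In this model the first call stalls with nothing written, because the three
slots are taken by LBAs 1 to 3 while LBA 0 waits for a slot. The second call
completes all four writes. Out-of-order allocation only survives in the model
when the hole fits within the limit, e.g. with max_outstanding=4.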
Thanks,
Bart.