Re: [PATCH v2 4/5] scsi: Retry unaligned zoned writes

On 7/19/23 07:53, Bart Van Assche wrote:
> On 7/17/23 23:47, Damien Le Moal wrote:
>> On 7/11/23 03:01, Bart Van Assche wrote:
>>> Send commands that failed with an unaligned write error to the SCSI error
>>> handler. Let the SCSI error handler sort SCSI commands per LBA before
>>> resubmitting these.
>>>
>>> Increase the number of retries for write commands sent to a sequential
>>> zone to the maximum number of outstanding commands.
>>
>> I think I mentioned this before. When we started btrfs work, we did something
>> similar (but at the IO scheduler level) to try to avoid adding a big lock in
>> btrfs to serialize (and thus order) writes. What we discovered is that it was
>> extremely easy to fall into a situation where the maximum number of possible
>> outstanding requests is already issued, but they all are behind a "hole" and
>> indefinitely delayed because the missing request cannot be issued due to the
>> max nr request limit being reached. No forward progress and deadlock.
>>
>> I do not see how your change addresses this problem. The same will happen with
>> this and I do not have any suggestion how to solve this. For btrfs, we ended up
>> using zone append emulation for scsi to avoid the big lock and to keep the FS
>> from having to order writes. That solution guarantees forward progress.
>> Delaying already issued writes that are not sequential has no such guarantees.
> 
> Hi Damien,
> 
> Thank you for having explained in detail the scenario that you ran into.
> 
> I think what has been explained above is a scenario in which the filesystem
> allocates requests per zone in another order than the LBA order. How about
> requiring that the filesystem allocates and submits zoned writes in LBA order
> per zone? I think that this is how F2FS supports zoned storage.

Sure. But what if an application uses the drive directly? You lose
guarantees of forward progress then. Given that an application has to
use direct IO for writes to sequential zones, this is unlikely to happen
in a "good" scenario, but it also would not be hard to write an
application that can deadlock the drive forever by simply missing one
write in a sequence of writes for a zone... That is my concern. While
f2fs would likely be OK, the delay approach is not solid enough for all
cases.
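
To make the stall concrete, here is a minimal sketch of the scenario as a toy
Python model. It is not kernel code: the function name simulate(), its
parameters, and the one-block-per-write simplification are all assumptions made
for illustration. Writes that are not at the write pointer are held, and each
held write keeps occupying one of the max_outstanding request slots.

```python
def simulate(submit_order, start_lba, max_outstanding):
    """Toy model of one sequential zone; returns (write_pointer, stalled).

    submit_order: LBAs in the order the submitter tries to issue them
    (hypothetical simplification: every write covers exactly one block).
    """
    wp = start_lba                 # zone write pointer
    held = set()                   # issued writes delayed behind a hole
    pending = list(submit_order)   # writes not yet issued

    while pending:
        if len(held) >= max_outstanding:
            # Every request slot is taken by a held write, so the write
            # that would fill the hole can never be issued.
            return wp, True
        held.add(pending.pop(0))
        # Complete every held write that now lines up with the write pointer.
        while wp in held:
            held.remove(wp)
            wp += 1

    # Nothing left to issue: leftover held writes wait forever for the hole.
    return wp, bool(held)
```

With 4 request slots, simulate([1, 2, 3, 4, 0, 5, 6, 7], 0, 4) returns
(0, True): the write to LBA 0 never gets a slot. With 8 slots the same order
completes, and simulate([1, 2, 3], 0, 8) stalls at any queue depth because the
write to LBA 0 is never submitted at all.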



-- 
Damien Le Moal
Western Digital Research



