Re: [PATCH] Revert "block: Do not discard buffers under a mounted filesystem"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2021/02/17 2:51, Keith Busch wrote:
> On Tue, Feb 16, 2021 at 04:36:06PM +0000, Christoph Hellwig wrote:
>> On Tue, Feb 16, 2021 at 02:38:49PM +0100, Jan Kara wrote:
>>> Apparently there are several userspace programs that depend on being
>>> able to call BLKDISCARD ioctl without the ability to grab bdev
>>> exclusively - namely FUSE filesystems have the device open without
>>> O_EXCL (the kernel has the bdev open with O_EXCL) so the commit breaks
>>> fstrim(8) for such filesystems. Also LVM when shrinking LV opens PV and
>>> discards ranges released from LV but that PV may be already open
>>> exclusively by someone else (see bugzilla link below for more details).
>>>
>>> This reverts commit 384d87ef2c954fc58e6c5fd8253e4a1984f5fe02.
>>
>> I think that is a bad idea. We fixed the problem for a reason.
>> I think the right fix is to just do nothing if the device hasn't been
>> opened with O_EXCL and can't be reopened with it, just don't do anything
>> but also don't return an error.  After all discard and thus
>> BLKDISCARD is purely advisory.
> 
> A discard is advisory, but BLKZEROOUT is not, so something different
> should happen there. We were also planning to send a patch using this
> same pattern for Zone Reset to fix stale page cache issues after the
> reset, but we'll wait to see how this settles before sending that.

There is also another problem: the truncate_bdev & operation following it
(discard, zeroout or zone reset) are not atomic vs read/write operations to the
bdev. Without mutual exclusion, that page invalidation is best effort only since
reads can snick in between the truncate and discard (or zeroout or zone reset).
With our zone reset stale page problem case, it is reads from udevd that we see
snicking in between the truncate bdev and zone reset and so we still get stale
pages after the zone reset is finished. No solution to propose for solving that,
yet...



-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux