Barry Scott wrote on Fri, Feb 02, 2024 at 08:07:46PM +0000:
> > On 2 Feb 2024, at 17:58, Florian Weimer <fweimer@xxxxxxxxxx> wrote:
> > The second one is a standard SATA drive in an USB enclosure, and those
> > have write-reordering caches, as far as I understand it.
>
> We need a kernel storage expert to tell us the definitive truth on this stuff.
> I may be out of date on this stuff.

This isn't really my turf so I just "read the code" (as of 6.7), but...

> What I understand is that the drive will be told via the appropriate SCSI(?) command
> that it must not reorder the writes. Failure to implement that command means the
> drive will not have WHQL from Microsoft. Without WHQL its very hard to sell a drive
> in the market place.

What seems to happen is the other way around:
- the scsi layer sends a MODE SENSE command to query what the drive
  supports, but that fails ("No Caching mode page found")
- it then goes on to pick whatever default value is configured
  (literally assuming the drive works as per the default, hence the
  "Assuming drive cache: write through" message)
- In this write through mode the 'WCE' flag is set to 0, which'll in
  turn configure the blk queue (linux side) to clear the
  "QUEUE_FLAG_WC" and "QUEUE_FLAG_HW_WC" flags
- Things get a little harder to follow from here, but it looks like
  when QUEUE_FLAG_WC isn't set, flush ops get REQ_PREFLUSH | REQ_FUA
  cleared from the request, and if there was no data to write the op
  is skipped outright... (submit_bio_noacct, around the "Filter flush
  bio's early so that bio based drivers without flush support don't
  have to worry about them." comment - I definitely could be
  misunderstanding the code below)
- I also don't see anything that'd tell the disk about our assumption
  -- there's a "cache_type_store" function that'll expose a cache_type
  sysfs knob for userspace to override, but at least the kernel doesn't
  look like it'll send a scsi command to set it by default.
  Also, since we couldn't read the cache mode, there's no guarantee
  it'd be settable in the first place.

So I think Florian is correct in that barriers won't be issued on these
disks, and if they internally have such a cache it'd probably be
unsafe...

Now does the disk itself know that it's in such an enclosure and
properly behave as write through?
I think we'd have a lot more corruption on our hands if it was
incorrect here; btrfs in particular is very sensitive to disks that lie
about barriers, but I'm not sure how much it's used on such drives.

I've had a quick look but didn't find any 'disk barrier sanity tool'
that'd issue a succession of write + flush ops in an order that'd be
easy to reorder (e.g. 13_______2) for one to unplug the disk and then
check there was no hole after plugging it back in; if someone is aware
of one, that'd be interesting to test on such an enclosed HDD.
(Of course, getting a safe order back is no guarantee that the disk is
always consistent, but it's probably possible to come up with a few
patterns that often fail when manually misconfiguring a disk)

-- 
Dominique Martinet | Asmadeus
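
For illustration only, here is a minimal Python sketch of the kind of write + flush check described above. It is not an existing tool, and every name in it (write_order, write_pattern, find_holes, the BARRIER1 magic) is made up for this example. It assumes the target starts out zeroed (or at least free of blocks from a previous run), that each write is followed by os.fsync(), and that a real test would point it at the raw block device rather than a regular file, which is an assumption that has not been tried on actual hardware.

```python
import os
import struct

BLOCK = 4096                 # bytes per test block (arbitrary choice)
MAGIC = b"BARRIER1"          # hypothetical marker, 8 bytes
HEADER_LEN = len(MAGIC) + 8  # magic + 64-bit sequence number


def write_order(n):
    """Interleave low and high slots (0, n-1, 1, n-2, ...) so that
    consecutive writes land far apart and are tempting to reorder."""
    order = []
    lo, hi = 0, n - 1
    while lo <= hi:
        order.append(lo)
        if lo != hi:
            order.append(hi)
        lo += 1
        hi -= 1
    return order


def make_block(seq):
    """Build one block tagged with its position in the write sequence."""
    header = MAGIC + struct.pack("<Q", seq)
    return header + b"\0" * (BLOCK - len(header))


def write_pattern(path, n, stop_after=None):
    """Write tagged blocks in write_order(), fsync()ing after each one.
    stop_after only exists to emulate an interrupted run in tests."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for seq, slot in enumerate(write_order(n)):
            if stop_after is not None and seq >= stop_after:
                break
            os.pwrite(fd, make_block(seq), slot * BLOCK)
            os.fsync(fd)  # whether this reaches the disk is what we test
    finally:
        os.close(fd)


def find_holes(path, n):
    """Return sequence numbers that are missing although a later one is
    present. With a flush after every write, this should be empty."""
    present = set()
    fd = os.open(path, os.O_RDONLY)
    try:
        for seq, slot in enumerate(write_order(n)):
            data = os.pread(fd, BLOCK, slot * BLOCK)
            if data[:HEADER_LEN] == make_block(seq)[:HEADER_LEN]:
                present.add(seq)
    finally:
        os.close(fd)
    if not present:
        return []
    last = max(present)
    return [s for s in range(last) if s not in present]
```

The idea would be to start write_pattern() on the device, pull the plug partway through, then run find_holes() after plugging it back in: any non-empty result means a later write survived while an earlier, already-flushed one did not. An empty result proves nothing on its own, as noted above.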