Re: status of bugzilla #99171 - mdraid broken for O_DIRECT

On 10/9/24 23:38, Reindl Harald wrote:

On 09.10.24 at 22:08, Roland wrote:
as proxmox hypervisor does not offer mdadm software raid at installation
time because of this bugticket

"MD RAID or DRBD can be broken from userspace when using O_DIRECT"
https://bugzilla.kernel.org/show_bug.cgi?id=99171

ps:
also see "qemu cache=none should not be used with mdadm"
https://bugzilla.proxmox.com/show_bug.cgi?id=5235
that all sounds like terrible nonsense

if "Yes. O_DIRECT is really fundamentally broken. There's just no way to fix it sanely. Except by teaching people not to use it, and making the normal paths fast enough" it has to go away

it's not acceptable that userspace can break the integrity of the underlying RAID - period

Take a deep breath, everyone.
Nothing has happened, nothing has been broken.
All systems continue to operate as normal.

If you look closely at the mentioned bug, you'll find that the test case modifies the buffer at random times, in particular while it's being written to disk. Now, the boilerplate text for O_DIRECT says: the application is in control of the data, and the data will be written without any caching.
Applied to our test case, that means the application _can_ modify
the data, even while it's in the process of being written to disk (zero copy and all that).
We do guarantee that data is consistent once I/O is completed (here:
once 'write' returns), but we do not (and, in fact, cannot) guarantee
that data is consistent while write() is running.

Which means that the test case is actually invalid; you would need to either drop O_DIRECT or modify the buffer only after write() has returned to arrive at
a valid example.
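
To illustrate the second option (touching the buffer only after write() has returned), here is a minimal C sketch. It is not taken from the bug report or from md; write_stable() and DIO_BLOCK are hypothetical names, and the 4096-byte alignment is an assumption, not a value queried from the device. The function copies the caller's data into a private, aligned bounce buffer and writes only from that copy, so no other thread can change the bytes while the write is in flight. The caller is assumed to have opened fd with O_DIRECT.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Assumed alignment and transfer granularity; hypothetical, not probed. */
#define DIO_BLOCK 4096

/*
 * Write a stable snapshot of src to fd at the current file offset.
 * The data is padded with zeros to a whole number of DIO_BLOCK bytes.
 * Returns 0 on success, -1 with errno set on failure.
 */
int write_stable(int fd, const void *src, size_t len)
{
    void *bounce = NULL;
    size_t padded = (len + DIO_BLOCK - 1) / DIO_BLOCK * DIO_BLOCK;
    size_t done = 0;
    int rc;

    if (len == 0)
        return 0;

    /* Private, aligned copy: nobody else holds a pointer to it. */
    rc = posix_memalign(&bounce, DIO_BLOCK, padded);
    if (rc != 0) {
        errno = rc;
        return -1;
    }
    memcpy(bounce, src, len);
    memset((char *)bounce + len, 0, padded - len);

    while (done < padded) {
        ssize_t n = write(fd, (char *)bounce + done, padded - done);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, retry */
        if (n <= 0) {
            int saved = (n == 0) ? EIO : errno;
            free(bounce);
            errno = saved;
            return -1;
        }
        assert((size_t)n <= padded - done);
        done += (size_t)n;
    }

    /* Only now, after write() has returned, is the buffer released. */
    free(bounce);
    return 0;
}
```

Once write_stable() returns 0, the caller may reuse or modify its own buffer freely. The extra copy gives up the zero-copy benefit of O_DIRECT; that is the price of a stable buffer in this sketch.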

That doesn't mean that I don't agree with the comments about O_DIRECT.

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@xxxxxxx                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich




