On 10/9/24 23:38, Reindl Harald wrote:
> On 09.10.24 at 22:08, Roland wrote:
>> as proxmox hypervisor does not offer mdadm software raid at installation
>> time because of this bug ticket
>> "MD RAID or DRBD can be broken from userspace when using O_DIRECT"
>> https://bugzilla.kernel.org/show_bug.cgi?id=99171
>> ps:
>> also see "qemu cache=none should not be used with mdadm"
>> https://bugzilla.proxmox.com/show_bug.cgi?id=5235
> that all sounds like terrible nonsense
> if "Yes. O_DIRECT is really fundamentally broken. There's just no way to
> fix it sanely. Except by teaching people not to use it, and making the
> normal paths fast enough" it has to go away
> it's not acceptable that userspace can break the integrity of the
> underlying RAID - period
Take a deep breath, everyone.
Nothing has happened, nothing has been broken.
All systems continue to operate as normal.
If you look closely at the test case in the mentioned bug, you'll find
that it modifies the buffer at random times, in particular while the
buffer is being written to disk.
Now, the boilerplate text for O_DIRECT says: the application is in
control of the data, and the data will be written without any caching.
Applied to our test case, that means the application _can_ modify
the data, even while it is in the process of being written to disk (zero
copy and all that).
We do guarantee that data is consistent once I/O is completed (here:
once 'write' returns), but we do not (and, in fact, cannot) guarantee
that data is consistent while write() is running.
Which means that the test case is actually invalid; you would either
need to drop O_DIRECT or modify the buffer only after write() has
returned to arrive at a valid example.
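To illustrate the "modify the buffer only after the write has returned" variant, here is a minimal sketch in C. It is not the test case from the bug report; the helper name write_then_modify(), the 4096-byte block size, and the fallback to buffered I/O when O_DIRECT is unavailable are assumptions made for the example. It uses pwrite(), which like write() has completed the transfer by the time it returns.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* assumption: fall back to buffered I/O where unavailable */
#endif

#define BLK 4096                /* assumed alignment and transfer size; devices may differ */

/*
 * Hypothetical helper: writes one block of 'A', overwrites the buffer with
 * 'B' only AFTER pwrite() has returned, then reads the block back.
 * Returns 0 if the file still holds the 'A' data, -1 on error or mismatch.
 */
int write_then_modify(const char *path)
{
    void *mem = NULL;
    unsigned char *buf;
    int fd, rc = -1;
    size_t i;

    /* O_DIRECT needs a suitably aligned buffer, length, and file offset. */
    if (posix_memalign(&mem, BLK, BLK) != 0)
        return -1;
    buf = mem;

    fd = open(path, O_CREAT | O_TRUNC | O_RDWR | O_DIRECT, 0600);
    if (fd < 0 && errno == EINVAL)  /* filesystem without O_DIRECT support */
        fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        goto out_free;

    memset(buf, 'A', BLK);
    if (pwrite(fd, buf, BLK, 0) != BLK)
        goto out_close;

    /* The I/O has completed, so the buffer may be reused from here on.
     * Touching it from another thread while pwrite() was still running
     * would be the invalid case discussed above. */
    memset(buf, 'B', BLK);

    /* Read back into the same aligned buffer and check for the 'A' data. */
    if (pread(fd, buf, BLK, 0) != BLK)
        goto out_close;
    for (i = 0; i < BLK; i++)
        if (buf[i] != 'A')
            goto out_close;
    rc = 0;

out_close:
    close(fd);
    unlink(path);
out_free:
    free(mem);
    return rc;
}
```

Calling write_then_modify() on a scratch file should return 0, since the buffer is left alone for as long as the I/O is in flight.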
That doesn't mean that I don't agree with the comments about O_DIRECT.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich