On 22.01.2024 17:52, Zdenek Kabelac wrote:
On 22. 01. 24 at 14:46, Anthony Iliopoulos wrote:
On Mon, Jan 22, 2024 at 01:48:41PM +0100, Zdenek Kabelac wrote:
On 22. 01. 24 at 12:22, Su Yue wrote:
Hi lvm folks,
Recently we received a report about a device cache issue after
vgchange --deltag.
What confuses me is that lvm never calls fsync on block devices, even
at the end of the commit phase.
IIRC, it's common for userspace tools to use
fsync/O_SYNC/O_DSYNC when writing
critical data. Yes, lvm2 opens devices with O_DIRECT if they support
it, but O_DIRECT doesn't
guarantee that the data is persistent on storage when the write
returns. The data can still be in the device cache.
If a power failure happens in that window, critical metadata
such as VG metadata could be lost.
Is there any particular reason not to flush data cache at VG commit
time?
Hi
It seems the call to the 'dev_flush()' function somehow got lost
during the conversion to async aio usage - I'll investigate.
On the other hand, the chance of losing any data this way would be
very specific to some oddly behaving device.
There's no guarantee that data will be persisted to storage without
explicitly flushing the device data cache. Those are usually volatile
write-back caches, so the data aren't really protected against power
loss without fsyncing the blockdev.
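
To illustrate the point about explicit flushing, here is a minimal
sketch (in Python, not lvm2's C) of writing a buffer and then calling
fsync() on the same descriptor before treating the write as committed.
The helper name write_durably is hypothetical and is not lvm2's
dev_flush(); the assumption is that fsync() on the descriptor is what
asks the kernel to push the data, and on Linux block devices the
device write cache, out to stable storage.

```python
import os


def write_durably(path: str, data: bytes, offset: int = 0) -> int:
    """Write `data` at `offset`, then fsync before reporting success.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        written = 0
        # pwrite may write fewer bytes than requested, so loop until done.
        while written < len(data):
            written += os.pwrite(fd, data[written:], offset + written)
        # Without this call the data may still sit in a volatile cache.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)
```

The design choice is simply ordering: the caller only learns that the
write succeeded after fsync() has returned without error.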
At the technical level, modern storage devices 'should' have enough
energy held internally to flush all their caches to persistent storage
in emergency cases. So unless we are dealing with some 'virtual'
storage that may fake various responses to IO handling, this should not
cause major trouble.
However, it's clearly a problem that was introduced when the code was
shifted towards the use of libaio.
Zdenek
Moreover, there is a very old post about fsync() lying:
https://brad.livejournal.com/2116715.html
I don't know, maybe that post is also a lie :) Or maybe devices have
become more truthful by now.
But many devices report that "Write cache" is enabled:
hdparm -I /dev/sda | grep 'Write cache'
* Write cache
And in many cases fsync() flushes data to the write cache only.
But this can be a persistent (SSD, flash) cache. Or, as Zdenek wrote,
"devices 'should' have enough energy held internally to be able to flush
out all the caches in emergency cases".
However, in some cases they may lose data on power failure when there
is a large amount of dirty data in the cache, especially ordinary,
non-enterprise HDDs. IMHO.
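
As a complement to the hdparm check, here is a small sketch that reads
the cache mode the Linux kernel reports for a block device through the
sysfs attribute queue/write_cache, which contains "write back" or
"write through". The function names and the sysfs_root parameter are
hypothetical, added only so the lookup can be tried against a test
directory. Note that this shows what the kernel believes about the
device, not whether the device honours flush requests.

```python
from pathlib import Path


def write_cache_mode(device: str, sysfs_root: str = "/sys/block") -> str:
    """Return the contents of <sysfs_root>/<device>/queue/write_cache."""
    attr = Path(sysfs_root) / device / "queue" / "write_cache"
    return attr.read_text().strip()


def has_writeback_cache(device: str, sysfs_root: str = "/sys/block") -> bool:
    """True if the kernel treats the device cache as write-back."""
    return write_cache_mode(device, sysfs_root) == "write back"
```

For example, has_writeback_cache("sda") would be expected to return
True on a disk like the one in the hdparm output, assuming the kernel
detected its cache as write-back.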
----