On 18-08-2017 00:51, Wols Lists wrote:
> Except that that is not what should be happening. I don't know my hard
> drive details, but I believe drives have an instruction "async write
> this data and let me know when you have done so".
>
> This should NOT return "yes I've flushed it TO cache". Which is how you
> get your problem - the level above thinks it's been safely flushed to
> disk (because the disk has said "yes I've got it"), but it then gets
> lost because of your power fluctuation. It should only acknowledge it
> *after* it's been flushed *from* cache.
>
> And this is apparently exactly what cheap drives do ...
>
> If the level above says "tell me when it's safely on disk", and the
> drive truly does as it's told, your problem won't happen because the
> disk block layer will time out waiting for the acknowledgement and
> retry the write.
With SATA drives, persistence on the physical medium is generally
guaranteed by issuing *two* separate FLUSH_CACHE commands, which do
*not* form an atomic operation. In other words, this is not a problem of
"cheap drives" or "lying hardware"; rather, it seems to be a specific
SATA limitation.
This means the problem cannot be solved by simply "buying better
disks". The traditional flushing/barrier infrastructure simply has *no*
method to ensure an atomic commit at the hardware level. If something
goes wrong between the two flushes, there is a (small) possibility of
corrupted writes with no I/O error reported to the upper layer, even
for sync() writes. It is basically like a failing DRAM cache, but with
*no* real failure...
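
To make the window between the two flushes concrete, here is a toy
model in Python. It is only a sketch under stated assumptions: ToyDrive,
power_glitch() and journal_commit() are hypothetical names invented for
illustration, and the model assumes the drive silently drops its
volatile cache on a power glitch and then answers the next flush with
success. It does not describe any real drive or kernel code path.

```python
class ToyDrive:
    """Hypothetical drive with a volatile write-back cache (illustration only)."""

    def __init__(self):
        self.cache = {}    # volatile: contents vanish on a power glitch
        self.medium = {}   # persistent storage

    def write(self, lba, data):
        # The write is acknowledged as soon as it sits in the cache.
        self.cache[lba] = data
        return True

    def flush(self):
        # Move everything from cache to medium; an empty cache also "succeeds".
        self.medium.update(self.cache)
        self.cache.clear()
        return True

    def power_glitch(self):
        # Assumption: the cache is lost and nothing is reported to the host.
        self.cache.clear()


def journal_commit(drive, data, glitch_before_second_flush=False):
    """Write data, flush, write commit record, flush. Return True if no step failed."""
    ok = all(drive.write(lba, block) for lba, block in data.items())
    ok = drive.flush() and ok                      # first flush: data blocks
    ok = drive.write("commit", b"COMMIT") and ok   # commit record goes to cache
    if glitch_before_second_flush:
        drive.power_glitch()                       # something goes wrong in between
    ok = drive.flush() and ok                      # second flush: commit record
    return ok


drive = ToyDrive()
reported_ok = journal_commit(drive, {0: b"a", 1: b"b"},
                             glitch_before_second_flush=True)
print("host saw success:", reported_ok)
print("commit record on medium:", "commit" in drive.medium)
```

In this model every step returns success, yet the commit record never
reaches the medium: the two flushes are independent commands, so nothing
ties the state seen by the second one to the state left by the first.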
Newer drives should implement FUA (Force Unit Access), but I don't know
if libata already uses it by default. Anyway, the disk's firmware is
free to split a single FUA write into several internal operations, so I
am not sure it solves all problems.
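
For anyone wanting to check what their own system reports, a minimal
Python sketch follows. It assumes a reasonably recent kernel that
exposes queue/write_cache and queue/fua under /sys/block/<disk>/ and the
libata module parameter under /sys/module/libata/parameters/fua; on
kernels where an attribute is missing the value is simply None. The
helper names read_attr() and cache_report(), and the disk name "sda",
are my own placeholders.

```python
from pathlib import Path


def read_attr(path):
    """Return the stripped contents of a sysfs attribute, or None if unavailable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def cache_report(disk, sysfs_root="/sys"):
    """Collect write-cache and FUA related attributes for one disk."""
    root = Path(sysfs_root)
    queue = root / "block" / disk / "queue"
    return {
        # "write back" or "write through", as seen by the block layer
        "write_cache": read_attr(queue / "write_cache"),
        # whether the block layer considers FUA writes supported on this queue
        "queue_fua": read_attr(queue / "fua"),
        # libata module parameter controlling FUA use
        "libata_fua": read_attr(root / "module" / "libata" / "parameters" / "fua"),
    }


# "sda" is only an example; substitute the disk you care about.
for name, value in cache_report("sda").items():
    print(f"{name}: {value if value is not None else 'not available'}")
```

This only reports what the kernel believes about the device; it says
nothing about how the firmware handles a FUA write internally.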
I found the linux-scsi discussion really interesting; it is worth a
look...
Regards.
--
Danti Gionatan
Technical Support
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8