Hi,
Normally we do use write-back cache on the RAID (it is battery-backed, so it
should be ok...), but in order to verify this isn't the cause we switched to
write-through, and it didn't help.
The best behaviour I've seen so far was when I configured DATA="", but even
then it's all very timing-sensitive and I still get writes/closes which finish
ok but are not on the disk later on...
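
For reference, a successful write()/close() only means the kernel accepted the data, not that it has reached the disk. Below is a minimal Python sketch of forcing the data out with fsync. The helper name durable_write is made up for illustration and is not our application's code. Even with fsync, data may still be sitting in the RAID controller's cache when write-back is enabled, so this narrows the window rather than closing it.

```python
import os


def durable_write(path, data):
    """Write bytes to path and ask the kernel to flush them to the device.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Flush file data and metadata from the page cache to the device.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the parent directory so the new directory entry is flushed.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If fsync itself returns an error (raised as OSError in Python), the data should be treated as not written.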
And I've yet to see a real explanation/coverage of how journalling filesystems
in general, and EXT3 specifically, handle disk failure situations like power
loss or FC cable disconnection.
Currently our direction is to monitor the loop state (via /proc) and
initiate a killall on the application and umount once we see a loop down
indication.
A brute-force test of this mechanism seems to work...
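
A rough Python sketch of that monitoring loop, under stated assumptions: the /proc path and the "loop down" marker text depend on the FC HBA driver, so STATUS_PATH and LOOP_DOWN_MARKER below are placeholders, and the function names are made up for illustration. Only killall and umount are the real commands mentioned above.

```python
import subprocess
import time

# Placeholders: the real /proc file and the exact text that signals a
# loop-down state are driver-specific and must be filled in.
STATUS_PATH = "/proc/scsi/HBA_DRIVER/0"
LOOP_DOWN_MARKER = "loop down"


def loop_is_down(status_text, marker=LOOP_DOWN_MARKER):
    """Return True if the status text contains the loop-down marker."""
    return marker.lower() in status_text.lower()


def stop_and_unmount(app_name, mount_point, run=subprocess.run):
    """Kill the application, then unmount the filesystem it was using."""
    # check=False: keep going to umount even if killall finds no process.
    run(["killall", app_name], check=False)
    run(["umount", mount_point], check=False)


def watch(app_name, mount_point, status_path=STATUS_PATH, interval=1.0):
    """Poll the status file and react once when a loop-down state appears."""
    while True:
        with open(status_path) as f:
            if loop_is_down(f.read()):
                stop_and_unmount(app_name, mount_point)
                return
        time.sleep(interval)
```

The polling interval bounds how long the application can keep writing after the loop goes down, so a shorter interval trades CPU for a smaller window.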
Anyhow, thanks for the suggestion; I will be happy to continue experimenting.
Yuval