Re: Urgent Help Needed (regarding rbd cache)

On Thu, 1 Aug 2019 at 07:31, Muhammad Junaid <junaid.fsd.pk@xxxxxxxxx> wrote:
Your email has cleared up many things for me. Let me repeat my understanding: every critical data write (e.g. from Oracle or any other DB) will be done with sync/fsync flags, meaning it is only confirmed to the DB/app after it has actually been written to the hard drives/OSDs. Any other application can do this as well.
All other writes, such as OS logs, will be confirmed to the app/user immediately but written later, passing through the kernel, the RBD cache, and the physical drive cache (if any) before reaching the disks. These are susceptible to loss on power failure, but overall things are recoverable/non-critical.

That last part is probably a bit simplified. Between a program in a guest sending its data to the virtualised device, running in KVM on top of an OS that has remote storage over the network, and a storage server with its own OS, drive controller chip, and finally the physical drive(s) that store the write, there will be something like ~10 layers where write caching is possible, of which the RBD cache you were asking about is just one.

It just happens to sit very conveniently before the I/O has to leave the KVM host and travel back and forth over the network, so it is the last place where you can see huge gains in the guests' I/O response time. At the same time it can be shared between lots of guests on the KVM host, which should have tons of RAM available compared to any single guest, so it is a nice way to get a large cache for outgoing writes.
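For reference, the RBD cache is controlled per-client in ceph.conf. A minimal sketch of the relevant options is below; the sizes shown are illustrative choices, not recommendations, and you should check the defaults for your Ceph release before changing anything:

```ini
[client]
# Enable RBD client-side caching (writeback once the guest issues a flush)
rbd cache = true
# Stay in writethrough mode until the first flush is seen, so guests
# that never flush are not silently exposed to writeback semantics
rbd cache writethrough until flush = true
# Illustrative sizing: total cache, max dirty bytes, and dirty target
rbd cache size = 67108864
rbd cache max dirty = 50331648
rbd cache target dirty = 33554432
```

With `rbd cache max dirty = 0` the cache behaves as writethrough, which trades the latency gains described above for durability on every write.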

Also, to answer your first part: yes, all critical software that depends heavily on write ordering and integrity is hopefully already doing write operations that way, using sync(), fsync(), fdatasync() and similar calls, but I can't produce a list of all programs that do. Since there are already many layers of delayed, cached writes even without virtualisation and/or Ceph, applications that matter have mostly learned their lesson by now, so chances are very high that all your important databases and similar programs are doing the right thing.
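To make the distinction concrete, here is a minimal sketch of what such a "critical" write looks like at the syscall level (the path is hypothetical; any database write-ahead log does essentially this):

```python
import os

# Hypothetical WAL file; a real database would manage this path itself.
path = "/tmp/wal.log"

fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
os.write(fd, b"commit record\n")
os.fsync(fd)   # blocks until the data has reached stable storage;
               # only after this returns may the app ack the commit
os.close(fd)
```

An ordinary buffered write (plain `write()` with no fsync) returns as soon as the data is in the page cache, which is exactly the class of writes that every cache layer in the stack, RBD included, is allowed to hold on to.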

But if the guest is instead running a mail filter that does antivirus checks, spam checks and so on, operating on files that live on the machine for something like one second before either being dropped or sent on to the destination mailbox somewhere else, then an aggressive write cache is very useful: the effect of a crash would mostly be "the emails that were in the queue were lost, never acked by the final mail server, and will probably be resent by the previous SMTP server". For such a guest VM, forcing sync writes would be a net loss; it gains a lot from having large RAM write caches.

 
--
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
