On Fri, Jul 26, 2013 at 6:21 PM, Chinmay V S <cvs268@xxxxxxxxx> wrote:
> On Fri, Jul 26, 2013 at 12:02 PM, Kumar amit mehta <gmate.amit@xxxxxxxxx> wrote:
>> On Fri, Jul 26, 2013 at 05:14:21PM +0800, Chinmay V S wrote:
>>> > We have direct I/O (O_DIRECT), for example raw devices (/dev/rawctl) that
>>> > map to the block devices, and we also have the page cache. Now if I've
>>> > understood this correctly, direct I/O will bypass this page cache, which
>>> > is fine, I'll not get into the performance debate, but what about data
>>> > consistency? The kernel cannot and __shouldn't__ try to control how the
>>> > applications are being written. So one bad day somebody comes up with
>>> > an application which does both these types of I/O (one that goes
>>> > through the page cache and the other that doesn't), and in that application
>>> > one instance is writing directly to the backend device and the other
>>> > instance, which is not aware of this write, goes ahead and writes to the
>>> > page cache, and that write would be written later to the backend device.
>>> > So wouldn't we end up corrupting the on-disk data?
>>>
>>> Yes. And that is the responsibility of the application. While the
>>> existence of O_DIRECT may not be common sense, anyone who knows about
>>> it *must* know that it bypasses the kernel page-cache and hence *must*
>>> know the consequences of doing cached and direct I/O on the same file
>>> simultaneously.
>>>
>>> > I can think of multiple other scenarios which could corrupt the on-disk
>>> > data, if there aren't any safeguarding policies employed by the kernel.
>>> > But I'm very much sure that the kernel is aware of such nasty attempts, and
>>> > I'd like to know how the kernel takes care of this.
>>>
>>> O_DIRECT is an explicit flag not enabled by default.
>>>
>>> It is the app's responsibility to ensure that it does NOT misuse the
>>> feature. Essentially specifying the O_DIRECT flag is the app's way of
>>> saying - "Hey kernel, I know what I am doing.
>>> Please step aside and
>>> let me talk to the hardware directly. Please do NOT interfere."
>>>
>>> The kernel happily obliges.
>>>
>>> Later, the app should NOT go crying back to the kernel (and blaming it),
>>> if the app manages to screw up the direct "relationship" with the
>>> hardware.
>>
>> So leaving the hardware at the mercy of the application doesn't sound
>> like a good practice. This __may__ compromise kernel stability too. Also
>> think of this:
>>
>> In app1:
>> fdx = open("blah", O_RDWR|O_DIRECT);
>> write(fdx, buf, sizeof(buf));
>>
>> In app2 (unaware of app1):
>> fdy = open("blah", O_RDWR);
>> write(fdy, buf, sizeof(buf));
>>
>> I think this is not all that unlikely, and if you agree with me then
>> we may end up with the same could-be/would-be data corruption. Now who should
>> be blamed here: app1, app2 or the kernel? Or will it be handled
>> differently here?
>
> As long as both app1 and app2 are managing separate files (even on the
> same underlying storage media), the situation looks good.
>
> From an app developer's perspective:
> In case both the apps do I/O on the same file then it implies
> knowledge of the other app. (Otherwise how would the second app know
> that the file exists at such and such location?) And hence the second
> app really ought to think about what it is going to do.
>
> case1: app1 uses regular I/O;
> ==> app2 should NOT use direct I/O.
>
> case2: app1 uses direct I/O;
> ==> app2 should NOT use regular I/O.
>
> From a kernel developer's perspective:
> The kernel driver guarantees coherency between the page-cache and
> data transferred using O_DIRECT. Refer to page 15 of this deck[1]
> that talks about the design of O_DIRECT.
>
> In either case the bigger problem lies in the fact that both the apps
> need to work out a mutex mechanism to prevent the handful of
> readers-writers problems[2] when both try to read/write from the same
> file simultaneously.
>
> So it is more important (in fact, downright necessary) to ensure mutual
> exclusion between the 2 apps during I/O. Otherwise one of them will
> end up overwriting the changes made by the other, unless both the apps
> are doing ONLY read()s.
>
> [1] http://www.ukuug.org/events/linux2001/papers/html/AArcangeli-o_direct.html
> [2] http://en.wikipedia.org/wiki/Readers-writers_problem
>
>
> regards
> ChinmayVS

TL;DR
1. Do not worry about coherency between the page-cache and the data
transferred using O_DIRECT. The kernel will invalidate the cache after
an O_DIRECT write and flush the cache before an O_DIRECT read.

2. Use mutexes or semaphores (or any of the numerous options [1]) to
prevent the usual synchronisation problems during IPC using a shared
file.

[1] http://beej.us/guide/bgipc/output/html/singlepage/bgipc.html

regards
ChinmayVS
_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
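
A minimal sketch of point 2 of the TL;DR, assuming both apps agree to take
the same advisory lock before touching the shared file. The helper name
locked_pwrite() is hypothetical (it does not come from this thread), and
flock(2) is only one possible mechanism. It is advisory, so it protects
nothing against an app that skips the lock. The same pattern would apply
to a descriptor opened with O_DIRECT, but that additionally needs suitably
aligned buffers, which this sketch does not deal with.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf at offset off in the
 * file at path while holding an exclusive advisory lock on that file.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t locked_pwrite(const char *path, const void *buf, size_t len, off_t off)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Blocks until no other cooperating process holds the lock. */
    if (flock(fd, LOCK_EX) < 0) {
        close(fd);
        return -1;
    }

    /* Only one lock holder at a time reaches this write. */
    ssize_t n = pwrite(fd, buf, len, off);

    /* Release the lock explicitly; close() would also drop it. */
    flock(fd, LOCK_UN);
    close(fd);
    return n;
}
```

Under these assumptions, app1 and app2 would each call something like
locked_pwrite("blah", buf, sizeof(buf), 0); whichever asks for the lock
second blocks in flock() until the first has finished its write.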