> However, my understanding is that filesystems need not maintain the relative
> order of writes (as it received from vfs/kernel) on two different fds. Also,
> if we have to maintain the order it might come with increased latency. The
> increased latency can be because of having "newer" writes to wait on "older"
> ones. This wait can fill up write-behind buffer and can eventually result in
> a full write-behind cache and hence not able to "write-back" newer writes.

IEEE 1003.1, 2013 edition

http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html

> After a write() to a regular file has successfully returned:
>
> Any successful read() from each byte position in the file that was
> modified by that write shall return the data specified by the write()
> for that position until such byte positions are again modified.
>
> Any subsequent successful write() to the same byte position in the
> file shall overwrite that file data.

Note that the reference is to a *file*, not to a file *descriptor*.
It's an application of the general POSIX assumption that time is
simple, locking is cheap (if it's even necessary), and therefore
time-based requirements like linearizability - what this is - are
easy to satisfy. I know that's not very realistic nowadays, but
it's pretty clear: according to the standard as it's still written,
P2's write *is* required to overwrite P1's. Same vs. different fd
or process/thread doesn't even come into play.

Just for fun, I'll point out that the standard snippet above
doesn't say anything about *non overlapping* writes. Does POSIX
allow the following?

    write A
    write B
    read B, get new value
    read A, get *old* value

This is a non-linearizable result, which would surely violate
some people's (notably POSIX authors') expectations, but good
luck finding anything in that standard which actually precludes
it.
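
To make the two cases concrete, here is a minimal sketch in Python, whose os.pwrite() and os.pread() are thin wrappers over the POSIX calls. The function names, the temporary file, and the byte offsets are all my own assumptions for illustration. On a local filesystem both checks come out the "expected" way. That says nothing about what a distributed filesystem with write-behind will do; it only shows what an application would have to observe for the behavior to match the quoted text.

```python
import os
import tempfile


def overlapping_writes():
    """Two descriptors write the same bytes; return what a later read sees."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    # Two independent descriptors on one file, standing in for P1 and P2.
    fd1 = os.open(path, os.O_RDWR)
    fd2 = os.open(path, os.O_RDWR)
    try:
        os.pwrite(fd1, b"AAAA", 0)  # P1's write returns first
        os.pwrite(fd2, b"BBBB", 0)  # P2's write is issued afterwards
        # Read back through P1's descriptor: the standard's wording is
        # about the file, so P2's data is what should come back.
        return os.pread(fd1, 4, 0)
    finally:
        os.close(fd1)
        os.close(fd2)
        os.unlink(path)


def non_overlapping_writes():
    """Write region A, then region B; read B first, then A."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    writer = os.open(path, os.O_RDWR)
    reader = os.open(path, os.O_RDONLY)
    try:
        os.pwrite(writer, b"old-old-", 0)  # initial contents of A and B
        os.pwrite(writer, b"newA", 0)      # write A (bytes 0..3)
        os.pwrite(writer, b"newB", 4)      # write B (bytes 4..7)
        b_seen = os.pread(reader, 4, 4)    # read B
        a_seen = os.pread(reader, 4, 0)    # read A
        # The questionable outcome would be (b"newB", b"old-").
        return b_seen, a_seen
    finally:
        os.close(writer)
        os.close(reader)
        os.unlink(path)


if __name__ == "__main__":
    print(overlapping_writes())
    print(non_overlapping_writes())
```

Note that in this sketch both writes to A and B have already returned before either read is issued, so the per-position rule in the quoted text covers it. The interesting case for the question above is a reader racing with writes that are still in flight, which this sequential code does not exercise.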