On Mon, 8 Apr 2013 13:25:43 +0300 Boaz Harrosh <bharrosh@xxxxxxxxxxx> wrote:

> In this topic we do not actually know yet what it will look like,
> but we know what we do not like.
>
> The most troublesome part is the POSIX crap. POSIX says that if
> any fd a process holds on an inode is closed, all of the process's
> locks on that inode are lost, even if the locks were acquired
> through another fd opened with different modes. (They had good
> stuff to smoke when that was defined.)

Jeremy Allison did some detective work on why this is:

http://www.samba.org/samba/news/articles/low_point/tale_two_stds_os2.html

See the section on "First Implementation Past the Post".

> This is real crap, because it completely kills our ability to
> acquire other resources on the file and/or keep correct access
> modes. As soon as we need a lock we must open the fd in read/write
> mode, because if a client later needs write access we cannot
> re-open the file: we would lose the locks. But if we open
> read/write up front, we immediately lose our delegations, and in
> pNFS a write-open means something different from a read-open.
>
> So what we urgently need is a new locking API that is strictly
> per fd. When we open an fd for read and then acquire a read lock,
> we can continue to serve delegations. Only the close of that
> specific fd will drop the locks; any other parallel activity in
> the background will not affect anything.
>
> We can craft an API that is very similar to today's, only with
> the semantics changed. But we should also consider a completely
> new API that covers all kinds of locks, including a notification
> API. Perhaps we could also unite that API with the delegations
> API we want in the next topic.
>
> Thanks
> Boaz
>

Perhaps we can simply add a new clone() flag (CLONE_SANELOCKS?) that
indicates that the tasks in question want locks that are not affected
by close() from other tasks?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
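
For anyone who has not run into it, the close() behaviour described
above is easy to reproduce with a minimal C program (the test file
path below is arbitrary; run it, then probe the file with F_GETLK
from a second process to see that the lock is gone):

/*
 * Demonstrates the POSIX record-lock gotcha: a write lock taken
 * through fd1 is silently released as soon as *any* other fd this
 * process holds on the same inode (here fd2) is closed.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "/tmp/locktest";
	struct flock fl = {
		.l_type   = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_start  = 0,
		.l_len    = 0,		/* whole file */
	};

	int fd1 = open(path, O_RDWR | O_CREAT, 0644);
	int fd2 = open(path, O_RDONLY);	/* second fd, same inode */
	if (fd1 < 0 || fd2 < 0) {
		perror("open");
		return 1;
	}

	/* Take the write lock through fd1. */
	if (fcntl(fd1, F_SETLK, &fl) < 0) {
		perror("F_SETLK");
		return 1;
	}

	/*
	 * Close the unrelated read-only fd2. Per POSIX this drops the
	 * lock held via fd1 as well, even though fd1 is still open; a
	 * contending process probing with F_GETLK now sees F_UNLCK.
	 */
	close(fd2);

	printf("fd2 closed; the write lock taken via fd1 is gone\n");
	pause();	/* keep fd1 open so another process can verify */
	return 0;
}

With the per-fd semantics being asked for above, only close(fd1) would
drop this lock; closing fd2 would leave it in place.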