> > > At LSF this year, there was a discussion about the "wishlist" for
> > > userland file servers. One of the things brought up was the goofy
> > > and problematic behavior of POSIX locks when a file is closed. Boaz
> > > started a thread on it here:
> > >
> > > http://permalink.gmane.org/gmane.linux.file-systems/73364
> > >
> > > Userland fileservers often need to maintain more than one open file
> > > descriptor on a file. The POSIX spec says:
> > >
> > > "All locks associated with a file for a given process shall be
> > > removed when a file descriptor for that file is closed by that
> > > process or the process holding that file descriptor terminates."
> > >
> > > This is problematic since you can't close any file descriptor
> > > without dropping all your POSIX locks. Most userland file servers
> > > therefore end up opening the file with more access than is really
> > > necessary, and keeping fds open for longer than is necessary to work
> > > around this.
> > >
> > > This patchset is a first stab at an approach to address this problem
> > > by adding two new l_type values -- F_RDLCKP and F_WRLCKP (the 'P' is
> > > short for "private" -- I'm open to changing that if you have a
> > > better mnemonic).
> > >
> > > For all intents and purposes these lock types act just like their
> > > "non-P" counterparts. The difference is that they are only implicitly
> > > released when the fd against which they were acquired is closed. As
> > > a side effect, these locks cannot be merged with "non-P" locks since
> > > they have different semantics on close.
> > >
> > > I've given this patchset some very basic smoke testing and it seems
> > > to do the right thing, but it is still pretty rough. If this looks
> > > reasonable I'll plan to do some documentation updates and will take
> > > a stab at trying to get these new lock types added to the POSIX spec
> > > (as HCH recommended).
> > >
> > > At this point, my main questions are:
> > >
> > > 1) does this look useful, particularly for fileserver implementors?
> > >
> > > 2) does this look OK API-wise? We could consider different "cmd" values
> > > or even different syscalls, but I figured this makes it clearer that
> > > "P" and "non-P" locks will still conflict with one another.
> > >
> > > Jeff Layton (5):
> > > locks: consolidate checks for compatible filp->f_mode values in setlk
> > > handlers
> > > locks: add definitions for F_RDLCKP and F_WRLCKP
> > > locks: skip FL_FILP_PRIVATE locks on close unless we're closing the
> > > correct filp
> > > locks: handle merging of locks when FL_FILP_PRIVATE is set
> > > locks: show private lock types in /proc/locks
> >
> > I haven't looked at the patches, but it would be very good to have
> > locks per "open" and not per "fd".
> >
>
> My intent is to make it "per-filp" (aka "struct file") in the same way that
> flock() locks are today. Note that the patchset posted so far doesn't quite
> have the right semantics yet.
>
> Currently, I think that we want to give these locks flock-like inheritance and
> close semantics, but to allow them to conflict with "legacy" POSIX range
> locks.
>
> > What happens in this example?
> >
>
> As I said, I haven't sat down to change the implementation yet, but I'll try to
> answer this in the way that I think we'll want to do it...
>
> > fd1 = open("/somefile", ...);
> > fd2 = open("/somefile", ...);
> > fd3 = dup(fd1);
> >
>
> At this point:
>
> fd1 = filp1
> fd2 = filp2
> fd3 = filp1
>
> ...fd1 and fd3 both hold a reference to filp1.
>
> > lock(fd1, range1)
> > lock(fd2, range2)
> > lock(fd3, range3)
> >
>
> I'll assume that lock() means issuing an F_SETLK with F_WRLCKP
>
> > lock(fd2, range1) // => error already locked?
> >
>
> Right. fd1 will hold the lock on range1 so -EAGAIN.
>
> > lock(fd3, range1) // stacked lock?
> >
>
> Not stacked per se, but replaced.
> Since fd1 and fd3 refer to the same filp, this lock() call won't
> conflict and the new lock will replace the old one. Since the range is the same
> though, there will be no real difference in the outcome.
>
> > close(fd1)
> >
>
> fput(filp1), but fd3 still has a reference so the lock won't be released.
>
> > lock(fd2, range1) // is range1 still locked by fd3 ?
> >
>
> Yep, still locked.
>
> > What about fd-passing, will the locks be transferred/shared with the
> > other process?
> >
>
> Yes, I think so. Locks will be passed to the other process in the same way that
> flock() locks are today. AIUI, when you fork() you effectively
> dup() all the file descriptors of the parent, so that's the same as what
> happens above.
>
> Again though, I'm still trying to settle on what the semantics should be. None
> of this is etched in stone yet.

At a quick read, that sounds right to me: connect the locks to the kernel
struct file (filp) and we will get the desirable semantics you describe, and
I think the behavior will be easy to document.

Frank