On Wed, Oct 20, 2010 at 04:15:04AM +0100, Al Viro wrote: > On Tue, Oct 19, 2010 at 01:11:33PM -0600, Matthew Wilcox wrote: > > Hm. Sounds like the same question that the file leases code needs > > answered. The important difference is that the leases code can just > > refuse to set a lease on inodes with multiple dentries. > > > > While my mind's on it ... Al, is this code even close to correct? > > > > if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) > > goto out; > > if ((arg == F_WRLCK) > > && ((atomic_read(&dentry->d_count) > 1) > > || (atomic_read(&inode->i_count) > 1))) > > goto out; > > No. This is complete junk; note that e.g. ls -lR will disrupt it, since > lstat(2) will bump dentry refcount. The first part is more or less OK; > the second makes no sense. It may be nobody cares about false positives--if leases are something provided as a caching optimization, then it's OK to deny them more often than strictly necessary. (In fact, if you're using the write lease to allow somebody else to cache writes, then you want the stat to break the lease so it can get can trigger a cache flush and get an updated mtime.) Not that I'll claim those checks are correct. > What is it trying to do? Note that the first part also doesn't make a lot > of sense, since you could be acquiring a write reference *right* *now*, > just as that check passes. And you could finish getting it before you get > to do anything else in generic_setlease(). Yeah, that's a race. The lease code is broken. On my list of problems: - Leases are broken only by conflicting opens; but both nfsv4 delegations and (I'm told) Windows op locks actually require that read leases be broken on any operation that changes file metadata--including unlink, link, rename, chmod, and chown. - The internal kernel api used for lease-breaking is inherently racy as long as it consists of a single break_lease() call. (We probably need deny_leases() and undeny_leases(), or something.) I'd be happy just to have read leases that worked correctly; write leases (which, at least for the NFSv4 server's purposes, also need to be broken on *access* to metadata) are worse. I have patches that attempted to replace the current mechanism for F_RDLCK leases with an i_leasecount modeled on i_writecount. (So positive counts the number of leases held, negative counts operations temporarily disabling leases.) They were never completely correct. I also wasn't sure what sort of requirements I should meet to avoid noticeable scalability problems (how much additional space in the inode could I get away with? What about additional locks, or just writes to inode fields, on read opens?). --b. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html