On Mon, Jul 9, 2018 at 3:16 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote: > On Mon, Jul 9, 2018 at 2:52 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: >> On Mon, 2018-07-09 at 14:22 +0300, Amir Goldstein wrote: >>> On Mon, Jul 9, 2018 at 2:10 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: >>> > On Mon, 2018-07-09 at 16:15 +0800, Eddie Horng wrote: >>> > > 2018-07-09 14:30 GMT+08:00 Amir Goldstein <amir73il@xxxxxxxxx>: >>> > > > I have no clue. >>> > > > Is the leaked lock and crash on the client or the server? >>> > > > If you can get an strace from the process that gets the Leaked message >>> > > > maybe it will give us a clue to the sort of file descriptor of the leaked >>> > > > file and how it was opened. >>> > > > Alternatively print the inode numbers and file types of flock calls to see >>> > > > where we have a mismatch. >>> > > > >>> > > > Thanks, >>> > > > Amir. >>> > > >>> > > Both the leaked lock and crash are on the server. >>> > > >>> > > I can emulate one of the lock failure case with a reproducer run along with >>> > > android building. The reproducer's behavior and result are very similar with >>> > > out/.lock generated by android build to control only one build process can >>> > > run on at the same time. In the first time (out/.lock is not exist), >>> > > flock works but a >>> > > "Leaked ..." message is supposed caused by it. After a round of build >>> > > completed, do a second build, the out/.lock is now failed to be locked. >>> > > The reproducer open and flock another file under out/ can reproduce the case. >>> > > Can this scenario help us to debug? >>> > > >>> > > process 1: process 2: >>> > > $ ~/flock/a.out /mnt/n/out/mylock >>> > > flock succeed, press any key to continue... >>> > > >>> > > $ cd /mnt/n && make -j12 # (build android) >>> > > close succeed >>> > > $ ~/flock/a.out /mnt/n/out/mylock >>> > > failed to lock file '/mnt/n/out/mylock': Resource temporarily unavailable >>> > > close succeed >>> > > >>> > > reproducer: >>> > > #include <stdio.h> >>> > > #include <sys/types.h> >>> > > #include <sys/stat.h> >>> > > #include <fcntl.h> >>> > > #include <unistd.h> >>> > > #include <sys/file.h> >>> > > #include <errno.h> >>> > > #include <string.h> >>> > > >>> > > int main(int argc, void **argv) { >>> > > char *filename=argv[1]; >>> > > int fd = open(filename, O_RDWR|O_CREAT, 0666); >>> > > int flock_result = flock(fd, LOCK_EX | LOCK_NB); >>> > > int err; >>> > > if (flock_result != 0) { >>> > > printf("failed to lock file '%s': %s\n", filename, strerror(errno)); >>> > > goto out; >>> > > } >>> > > printf("flock succeed, press any key to continue...\n"); >>> > > getchar(); >>> > > >>> > > out: >>> > > err = close(fd); >>> > > if (err == 0) >>> > > printf("close succeed\n"); >>> > > else >>> > > printf("failed to close %d: %s\n", fd, strerror(errno)); >>> > > } >>> > > >>> > >>> > This setup is pretty complicated. IIUC, you are exporting overlayfs via >>> > knfsd and then using the NFS client's flock emulation to map flock locks >>> > to POSIX ones. I think you probably want to simplify this reproducer a >>> > bit. >>> > >>> > Is it possible to reproduce this on a setup that doesn't have overlayfs >>> > involved, just to rule it in or out as a factor here? >>> > >>> > There are also a number of tracepoints in the posix locking code. It >>> > might be interesting to turn on the ones for posix_lock_inode and >>> > locks_remove_posix and and then run the reproducer to get a better idea >>> > of what's happening to those locks. >>> > >>> >>> Thanks for the suggestions Jeff. >>> >>> Eddie, >>> >>> This is NFS v4. Right? >> >> I think he said v3, which means NLM (lockd). >> >>> Do you wait until Android build completes before closing the >>> first reproducer fd? >>> >>> I suspect you can replace the effect of Android build with >>> drop_caches on the server. >>> >>> Jeff, >>> >>> Does knfsd hold a reference on the file/dentry/inode when >>> a lock is taken? >>> >> >> Given that he's using v3, it would actually be lockd in this case, but >> yes, it should hold a struct file reference by virtue of a nlm_file >> reference. >> >>> Assuming this is indeed a bug reproduced only with NFS+overlayfs >>> it sounds like overlay decode file handle fails to return the same >>> inode that knfd holds with the lock. >> >> That's a possibility. > > Oh oh! > > Those should be s/file_inode/locks_inode: > > static inline struct inode *nlmsvc_file_inode(struct nlm_file *file) > { > return file_inode(file->f_file); > } > > static inline int nlm_compare_locks(const struct file_lock *fl1, > const struct file_lock *fl2) > { > return file_inode(fl1->fl_file) == file_inode(fl2->fl_file) > ... > > Probably all the instances under fs/lockd/ as well... > and probably some in fs/nfsd > > Messy. > > I have a suggestion. > > Eddie, > > Do you mind trying out the code in Miklos' overlayfs-next branch Sorry wrong link: git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git > > This code gets rid of the file_inode/locks_inode divergion > so its a much healthier fix for the problem going forward. > > Jeff, > > What do you suggest that do with stable >= v4.16 kernels with > overlayfs NFS export. Do we have an easy way to disable > knfsd/lockd from using locks on exported overlayfs? > Maybe a sanity check (file_inode(file) == locks_inode(file)) > to avoid the leaks/crashes. > Where would be the most strategic point(s) to add such a check? > > Thanks, > Amir. -- To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html