"Michael Kerrisk (man-pages)" <mtk.manpages@xxxxxxxxx> writes: > Hello Aneesh, > > After integrating review comments from NeilBown, Christoph Hellwig, > and Mike Frysinger, here is draft 4 of a man page I've written for > name_to_handle_at(2) and open_by_handle_at(2). (The changes since > draft 3 are only minor.) > > Would you be willing to review it please, and let me know of any > corrections/improvements? (Of course, further comments from anyone > else are also welcome.) > > There are some FIXMEs in the page that I would especially like some help with. > > Thanks, > > Michael > > > .\" Copyright (c) 2014 by Michael Kerrisk <mtk.manpages@xxxxxxxxx> > .\" > .\" %%%LICENSE_START(VERBATIM) > .\" Permission is granted to make and distribute verbatim copies of this > .\" manual provided the copyright notice and this permission notice are > .\" preserved on all copies. > .\" > .\" Permission is granted to copy and distribute modified versions of this > .\" manual under the conditions for verbatim copying, provided that the > .\" entire resulting derived work is distributed under the terms of a > .\" permission notice identical to this one. > .\" > .\" Since the Linux kernel and libraries are constantly changing, this > .\" manual page may be incorrect or out-of-date. The author(s) assume no > .\" responsibility for errors or omissions, or for damages resulting from > .\" the use of the information contained herein. The author(s) may not > .\" have taken the same level of care in the production of this manual, > .\" which is licensed free of charge, as they might when working > .\" professionally. > .\" > .\" Formatted or processed versions of this manual, if unaccompanied by > .\" the source, must acknowledge the copyright and authors of this work. > .\" %%%LICENSE_END > .\" > .TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual" > .SH NAME > name_to_handle_at, open_by_handle_at \- obtain handle > for a pathname and open file via a handle > .SH SYNOPSIS > .nf > .B #define _GNU_SOURCE > .B #include <sys/types.h> > .B #include <sys/stat.h> > .B #include <fcntl.h> > > .BI "int name_to_handle_at(int " dirfd ", const char *" pathname , > .BI " struct file_handle *" handle , > .BI " int *" mount_id ", int " flags ); > > .BI "int open_by_handle_at(int " mount_fd ", struct file_handle *" handle , > .BI " int " flags ); > .fi > .SH DESCRIPTION > The > .BR name_to_handle_at () > and > .BR open_by_handle_at () > system calls split the functionality of > .BR openat (2) > into two parts: > .BR name_to_handle_at () > returns an opaque handle that corresponds to a specified file; > .BR open_by_handle_at () > opens the file corresponding to a handle returned by a previous call to > .BR name_to_handle_at () > and returns an open file descriptor. > .\" > .\" > .SS name_to_handle_at() > The > .BR name_to_handle_at () > system call returns a file handle and a mount ID corresponding to > the file specified by the > .IR dirfd > and > .IR pathname > arguments. > The file handle is returned via the argument > .IR handle , > which is a pointer to a structure of the following form: > > .in +4n > .nf > struct file_handle { > unsigned int handle_bytes; /* Size of f_handle [in, out] */ > int handle_type; /* Handle type [out] */ > unsigned char f_handle[0]; /* File identifier (sized by > caller) [out] */ > }; > .fi > .in > .PP > It is the caller's responsibility to allocate the structure > with a size large enough to hold the handle returned in > .IR f_handle . > Before the call, the > .IR handle_bytes > field should be initialized to contain the allocated size for > .IR f_handle . > (The constant > .BR MAX_HANDLE_SZ , > defined in > .IR <fcntl.h> , > specifies the maximum possible size for a file handle.) > Upon successful return, the > .IR handle_bytes > field is updated to contain the number of bytes actually written to > .IR f_handle . > > The caller can discover the required size for the > .I file_handle > structure by making a call in which > .IR handle->handle_bytes > is zero; > in this case, the call fails with the error > .BR EOVERFLOW > and > .IR handle->handle_bytes > is set to indicate the required size; > the caller can then use this information to allocate a structure > of the correct size (see EXAMPLE below). > > Other than the use of the > .IR handle_bytes > field, the caller should treat the > .IR file_handle > structure as an opaque data type: the > .IR handle_type > and > .IR f_handle > fields are needed only by a subsequent call to > .BR open_by_handle_at (). > > The > .I flags > argument is a bit mask constructed by ORing together zero or more of > .BR AT_EMPTY_PATH > and > .BR AT_SYMLINK_FOLLOW , > described below. > > Together, the > .I pathname > and > .I dirfd > arguments identify the file for which a handle is to be obtained. > There are four distinct cases: > .IP * 3 > If > .I pathname > is a nonempty string containing an absolute pathname, > then a handle is returned for the file referred to by that pathname. > In this case, > .IR dirfd > is ignored. > .IP * > If > .I pathname > is a nonempty string containing a relative pathname and > .IR dirfd > has the special value > .BR AT_FDCWD , > then > .I pathname > is interpreted relative to the current working directory of the caller, > and a handle is returned for the file to which it refers. > .IP * > If > .I pathname > is a nonempty string containing a relative pathname and > .IR dirfd > is a file descriptor referring to a directory, then > .I pathname > is interpreted relative to the directory referred to by > .IR dirfd , > and a handle is returned for the file to which it refers. > (See > .BR openat (3) > for an explanation of why "directory file descriptors" are useful.) > .IP * > If > .I pathname > is an empty string and > .I flags > specifies the value > .BR AT_EMPTY_PATH , > then > .IR dirfd > can be an open file descriptor referring to any type of file, > or > .BR AT_FDCWD , > meaning the current working directory, > and a handle is returned for the file to which it refers. > .PP > The > .I mount_id > argument returns an identifier for the filesystem > mount that corresponds to > .IR pathname . > This corresponds to the first field in one of the records in > .IR /proc/self/mountinfo . > Opening the pathname in the fifth field of that record yields a file > descriptor for the mount point; > that file descriptor can be used in a subsequent call to > .BR open_by_handle_at (). > > By default, > .BR name_to_handle_at () > does not dereference > .I pathname > if it is a symbolic link, and thus returns a handle for the link itself. > If > .B AT_SYMLINK_FOLLOW > is specified in > .IR flags , > .I pathname > is dereferenced if it is a symbolic link > (so that the call returns a handle for the file referred to by the link). > .SS open_by_handle_at() > The > .BR open_by_handle_at () > system call opens the file referred to by > .IR handle , > a file handle returned by a previous call to > .BR name_to_handle_at (). > > The > .IR mount_fd > argument is a file descriptor for any object (file, directory, etc.) > in the mounted filesystem with respect to which > .IR handle > should be interpreted. > The special value > .B AT_FDCWD > can be specified, meaning the current working directory of the caller. > > The > .I flags > argument > is as for > .BR open (2). > .\" FIXME: Confirm that the following is intended behavior. > .\" (It certainly seems to be the behavior, from experimenting.) > If > .I handle > refers to a symbolic link, the caller must specify the > .B O_PATH > flag, and the symbolic link is not dereferenced; the > .B O_NOFOLLOW > flag, if specified, is ignored. > > > The caller must have the > .B CAP_DAC_READ_SEARCH > capability to invoke > .BR open_by_handle_at (). > .SH RETURN VALUE > On success, > .BR name_to_handle_at () > returns 0, > and > .BR open_by_handle_at () > returns a nonnegative file descriptor. > > In the event of an error, both system calls return \-1 and set > .I errno > to indicate the cause of the error. > .SH ERRORS > .BR name_to_handle_at () > and > .BR open_by_handle_at () > can fail for the same errors as > .BR openat (2). > In addition, they can fail with the errors noted below. > > .BR name_to_handle_at () > can fail with the following errors: > .TP > .B EFAULT > .IR pathname , > .IR mount_id , > or > .IR handle > points outside your accessible address space. > .TP > .B EINVAL > .I flags > includes an invalid bit value. > .TP > .B EINVAL > .IR handle_bytes\->handle_bytes > is greater than > .BR MAX_HANDLE_SZ . > .TP > .B ENOENT > .I pathname > is an empty string, but > .BR AT_EMPTY_PATH > was not specified in > .IR flags . > .TP > .B ENOTDIR > The file descriptor supplied in > .I dirfd > does not refer to a directory, > and it is not the case that both > .I flags > includes > .BR AT_EMPTY_PATH > and > .I pathname > is an empty string. > .TP > .B EOPNOTSUPP > The filesystem does not support decoding of a pathname to a file handle. > .TP > .B EOVERFLOW > The > .I handle->handle_bytes > value passed into the call was too small. > When this error occurs, > .I handle->handle_bytes > is updated to indicate the required size for the handle. > .\" > .\" > .PP > .BR open_by_handle_at () > can fail with the following errors: > .TP > .B EBADF > .IR mount_fd > is not an open file descriptor. > .TP > .B EFAULT > .IR handle > points outside your accessible address space. > .TP > .B EINVAL > .I handle->handle_bytes > is greater than > .BR MAX_HANDLE_SZ > or is equal to zero. > .TP > .B ELOOP > .\" FIXME (see earlier FIXME). Is this the intended behavior? > .I handle > refers to a symbolic link, but > .B O_PATH > was not specified in > .IR flags . > .TP > .B EPERM > The caller does not have the > .BR CAP_DAC_READ_SEARCH > capability. > .TP > .B ESTALE > The specified > .I handle > is not valid. > This error will occur if, for example, the file has been deleted. > .SH VERSIONS > These system calls first appeared in Linux 2.6.39. > Library support is provided in glibc since version 2.14. > .SH CONFORMING TO > These system calls are nonstandard Linux extensions. > .SH NOTES > A file handle can be generated in one process using > .BR name_to_handle_at () > and later used in a different process that calls > .BR open_by_handle_at (). > > Some filesystem don't support the translation of pathnames to > file handles, for example, > .IR /proc , > .IR /sys , > and various network filesystems. > > A file handle may become invalid ("stale") if a file is deleted, > or for other filesystem-specific reasons. > Invalid handles are notified by an > .B ESTALE > error from > .BR open_by_handle_at (). > > These system calls are designed for use by user-space file servers. > For example, a user-space NFS server might generate a file handle > and pass it to an NFS client. > Later, when the client wants to open the file, > it could pass the handle back to the server. > .\" https://lwn.net/Articles/375888/ > .\" "Open by handle" - Jonathan Corbet, 2010-02-23 > This sort of functionality allows a user-space file server to operate in > a stateless fashion with respect to the files it serves. > > If > .I pathname > refers to a symbolic link and > .IR flags > does not specify > .BR AT_SYMLINK_FOLLOW , > then > .BR name_to_handle_at () > returns a handle for the link (rather than the file to which it refers). > .\" commit bcda76524cd1fa32af748536f27f674a13e56700 > The process receiving the handle can later perform operations > on the symbolic link by converting the handle to a file descriptor using > .BR open_by_handle_at () > with the > .BR O_PATH > flag, and then passing the file descriptor as the > .IR dirfd > argument in system calls such as > .BR readlinkat (2) > and > .BR fchownat (2). You may want to specify that one need to pass AT_EMPTY_PATH in case of fchownat ? readlinkat do take null names, because there is no flags argument. For syscalls that take flags, to make it operate on fd, one need to pass "" path name and a flag value of AT_EMPTY_PATH. > .SS Obtaining a persistent filesystem ID > The mount IDs in > .IR /proc/self/mountinfo > can be reused as filesystems are unmounted and mounted. > Therefore, the mount ID returned by > .BR name_to_handle_at () > (in > .IR *mount_id ) > should not be treated as a persistent identifier > for the corresponding mounted filesystem. > However, an application can use the information in the > .I mountinfo > record that corresponds to the mount ID > to derive a persistent identifier. > > For example, one can use the device name in the fifth field of the > .I mountinfo > record to search for the corresponding device UUID via the symbolic links in > .IR /dev/disks/by-uuid . > (A more comfortable way of obtaining the UUID is to use the > .\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition > .BR libblkid (3) > library.) > That process can then be reversed, > using the UUID to look up the device name, > and then obtaining the corresponding mount point, > in order to produce the > .IR mount_fd > argument used by > .BR open_by_handle_at (). > .SH EXAMPLE > The two programs below demonstrate the use of > .BR name_to_handle_at () > and > .BR open_by_handle_at (). > The first program > .RI ( t_name_to_handle_at.c ) > uses > .BR name_to_handle_at () > to obtain the file handle and mount ID > for the file specified in its command-line argument; > the handle and mount ID are written to standard output. > > The second program > .RI ( t_open_by_handle_at.c ) > reads a mount ID and file handle from standard input. > The program then employs > .BR open_by_handle_at () > to open the file using that handle. > If an optional command-line argument is supplied, then the > .IR mount_fd > argument for > .BR open_by_handle_at () > is obtained by opening the directory named in that argument. > Otherwise, > .IR mount_fd > is obtained by scanning > .IR /proc/self/mountinfo > to find a record whose mount ID matches the mount ID > read from standard input, > and the mount directory specified in that record is opened. > (These programs do not deal with the fact that mount IDs are not persistent.) > > The following shell session demonstrates the use of these two programs: > > .in +4n > .nf > $ \fBecho 'Can you please think about it?' > cecilia.txt\fP > $ \fB./t_name_to_handle_at cecilia.txt > fh\fP > $ \fB./t_open_by_handle_at < fh\fP > open_by_handle_at: Operation not permitted > $ \fBsudo ./t_open_by_handle_at < fh\fP # Need CAP_SYS_ADMIN > Read 31 bytes > $ \fBrm cecilia.txt\fP > .fi > .in > > Now we delete and (quickly) re-create the file so that > it has the same content and (by chance) the same inode. > Nevertheless, > .BR open_by_handle_at () > .\" Christoph Hellwig: That's why the file handles contain a generation > .\" counter that gets incremented in this case. > recognizes that the original file referred to by the file handle > no longer exists. > > .in +4n > .nf > $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Display inode number > 4072121 > $ \fBrm cecilia.txt\fP > $ \fBecho 'Can you please think about it?' > cecilia.txt\fP > $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Check inode number > 4072121 > $ \fBsudo ./t_open_by_handle_at < fh\fP > open_by_handle_at: Stale NFS file handle > .fi > .in > .SS Program source: t_name_to_handle_at.c > \& > .nf > #define _GNU_SOURCE > #include <sys/types.h> > #include <sys/stat.h> > #include <fcntl.h> > #include <stdio.h> > #include <stdlib.h> > #include <unistd.h> > #include <errno.h> > #include <string.h> > > #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ > } while (0) > > int > main(int argc, char *argv[]) > { > struct file_handle *fhp; > int mount_id, fhsize, flags, dirfd, j; > char *pathname; > > if (argc != 2) { > fprintf(stderr, "Usage: %s pathname\\n", argv[0]); > exit(EXIT_FAILURE); > } > > pathname = argv[1]; > > /* Allocate file_handle structure */ > > fhsize = sizeof(*fhp); > fhp = malloc(fhsize); > if (fhp == NULL) > errExit("malloc"); > > /* Make an initial call to name_to_handle_at() to discover > the size required for file handle */ > > dirfd = AT_FDCWD; /* For name_to_handle_at() calls */ > flags = 0; /* For name_to_handle_at() calls */ > fhp\->handle_bytes = 0; > if (name_to_handle_at(dirfd, pathname, fhp, > &mount_id, flags) != \-1 || errno != EOVERFLOW) { > fprintf(stderr, "Unexpected result from name_to_handle_at()\\n"); > exit(EXIT_FAILURE); > } > > /* Reallocate file_handle structure with correct size */ > > fhsize = sizeof(struct file_handle) + fhp\->handle_bytes; > fhp = realloc(fhp, fhsize); /* Copies fhp\->handle_bytes */ > if (fhp == NULL) > errExit("realloc"); > > /* Get file handle from pathname supplied on command line */ > > if (name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags) == \-1) > errExit("name_to_handle_at"); > > /* Write mount ID, file handle size, and file handle to stdout, > for later reuse by t_open_by_handle_at.c */ > > printf("%d\\n", mount_id); > printf("%d %d ", fhp\->handle_bytes, fhp\->handle_type); > for (j = 0; j < fhp\->handle_bytes; j++) > printf(" %02x", fhp\->f_handle[j]); > printf("\\n"); > > exit(EXIT_SUCCESS); > } > .fi > .SS Program source: t_open_by_handle_at.c > \& > .nf > #define _GNU_SOURCE > #include <sys/types.h> > #include <sys/stat.h> > #include <fcntl.h> > #include <limits.h> > #include <stdio.h> > #include <stdlib.h> > #include <unistd.h> > #include <string.h> > > #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ > } while (0) > > /* Scan /proc/self/mountinfo to find the line whose mount ID matches > \(aqmount_id\(aq. (An easier way to do this is to install and use the > \(aqlibmount\(aq library provided by the \(aqutil\-linux\(aq project.) > Open the corresponding mount path and return the resulting file > descriptor. */ > > static int > open_mount_path_by_id(int mount_id) > { > char *linep; > size_t lsize; > char mount_path[PATH_MAX]; > int mi_mount_id, found; > ssize_t nread; > FILE *fp; > > fp = fopen("/proc/self/mountinfo", "r"); > if (fp == NULL) > errExit("fopen"); > > found = 0; > linep = NULL; > while (!found) { > nread = getline(&linep, &lsize, fp); > if (nread == \-1) > break; > > nread = sscanf(linep, "%d %*d %*s %*s %s", > &mi_mount_id, mount_path); > if (nread != 2) { > fprintf(stderr, "Bad sscanf()\\n"); > exit(EXIT_FAILURE); > } > > if (mi_mount_id == mount_id) > found = 1; > } > free(linep); > > fclose(fp); > > if (!found) { > fprintf(stderr, "Could not find mount point\\n"); > exit(EXIT_FAILURE); > } > > return open(mount_path, O_RDONLY); > } > > int > main(int argc, char *argv[]) > { > struct file_handle *fhp; > int mount_id, fd, mount_fd, handle_bytes, j; > ssize_t nread; > char buf[1000]; > #define LINE_SIZE 100 > char line1[LINE_SIZE], line2[LINE_SIZE]; > char *nextp; > > if ((argc > 1 && strcmp(argv[1], "\-\-help") == 0) || argc > 2) { > fprintf(stderr, "Usage: %s [mount\-path]\\n", argv[0]); > exit(EXIT_FAILURE); > } > > /* Standard input contains mount ID and file handle information: > > Line 1: <mount_id> > Line 2: <handle_bytes> <handle_type> <bytes of handle in hex> > */ > > if ((fgets(line1, sizeof(line1), stdin) == NULL) || > (fgets(line2, sizeof(line2), stdin) == NULL)) { > fprintf(stderr, "Missing mount_id / file handle\\n"); > exit(EXIT_FAILURE); > } > > mount_id = atoi(line1); > > handle_bytes = strtoul(line2, &nextp, 0); > > /* Given handle_bytes, we can now allocate file_handle structure */ > > fhp = malloc(sizeof(struct file_handle) + handle_bytes); > if (fhp == NULL) > errExit("malloc"); > > fhp\->handle_bytes = handle_bytes; > > fhp\->handle_type = strtoul(nextp, &nextp, 0); > > for (j = 0; j < fhp\->handle_bytes; j++) > fhp\->f_handle[j] = strtoul(nextp, &nextp, 16); > > /* Obtain file descriptor for mount point, either by opening > the pathname specified on the command line, or by scanning > /proc/self/mounts to find a mount that matches the \(aqmount_id\(aq > that we received from stdin. */ > > if (argc > 1) > mount_fd = open(argv[1], O_RDONLY); > else > mount_fd = open_mount_path_by_id(mount_id); > > if (mount_fd == \-1) > errExit("opening mount fd"); > > /* Open file using handle and mount point */ > > fd = open_by_handle_at(mount_fd, fhp, O_RDONLY); > if (fd == \-1) > errExit("open_by_handle_at"); > > /* Try reading a few bytes from the file */ > > nread = read(fd, buf, sizeof(buf)); > if (nread == \-1) > errExit("read"); > > printf("Read %zd bytes\\n", nread); > > exit(EXIT_SUCCESS); > } > .fi > .SH SEE ALSO > .BR open (2), > .BR libblkid (3), > .BR blkid (8), > .BR findfs (8), > .BR mount (8) > > The > .I libblkid > and > .I libmount > documentation in the latest > .I util-linux > release at > .UR https://www.kernel.org/pub/linux/utils/util-linux/ > .UE > > -- > Michael Kerrisk > Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ > Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line "unsubscribe linux-man" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html