Re: [PATCH v2] ioctl_getfsmap.2: document the GETFSMAP ioctl

On 05/18/2017 04:07 AM, Darrick J. Wong wrote:
> Document the new GETFSMAP ioctl that returns the physical layout of a
> (disk-based) filesystem.

Thanks, Darrick! Applied (with a few minor edits). (Currently sitting in
a local branch, just in case anyone sends review comments that need
integrating.)

Cheers,

Michael

> Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> ---
> v2: emphasize that filesystems are not obligated to return inode numbers
> ---
>  man2/ioctl_getfsmap.2 |  375 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 375 insertions(+)
>  create mode 100644 man2/ioctl_getfsmap.2
> 
> diff --git a/man2/ioctl_getfsmap.2 b/man2/ioctl_getfsmap.2
> new file mode 100644
> index 0000000..b451950
> --- /dev/null
> +++ b/man2/ioctl_getfsmap.2
> @@ -0,0 +1,375 @@
> +.\" Copyright (c) 2017, Oracle.  All rights reserved.
> +.\"
> +.\" %%%LICENSE_START(GPLv2+_DOC_FULL)
> +.\" This is free documentation; you can redistribute it and/or
> +.\" modify it under the terms of the GNU General Public License as
> +.\" published by the Free Software Foundation; either version 2 of
> +.\" the License, or (at your option) any later version.
> +.\"
> +.\" The GNU General Public License's references to "object code"
> +.\" and "executables" are to be interpreted as the output of any
> +.\" document formatting or typesetting system, including
> +.\" intermediate and printed output.
> +.\"
> +.\" This manual is distributed in the hope that it will be useful,
> +.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
> +.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +.\" GNU General Public License for more details.
> +.\"
> +.\" You should have received a copy of the GNU General Public
> +.\" License along with this manual; if not, see
> +.\" <http://www.gnu.org/licenses/>.
> +.\" %%%LICENSE_END
> +.TH IOCTL_GETFSMAP 2 2017-02-10 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +ioctl_getfsmap \- retrieve the physical layout of the filesystem
> +.SH SYNOPSIS
> +.br
> +.B #include <sys/ioctl.h>
> +.br
> +.B #include <linux/fs.h>
> +.br
> +.B #include <linux/fsmap.h>
> +.sp
> +.BI "int ioctl(int " fd ", FS_IOC_GETFSMAP, struct fsmap_head * " arg );
> +.SH DESCRIPTION
> +This
> +.BR ioctl (2)
> +retrieves physical extent mappings for a filesystem.
> +This information can be used to discover which files are mapped to a physical
> +block, examine free space, or find known bad blocks, among other things.
> +
> +The sole argument to this ioctl should be a pointer to a single
> +.BR "struct fsmap_head" ":"
> +.in +4n
> +.nf
> +
> +struct fsmap {
> +	__u32		fmr_device;	/* device id */
> +	__u32		fmr_flags;	/* mapping flags */
> +	__u64		fmr_physical;	/* device offset of segment */
> +	__u64		fmr_owner;	/* owner id */
> +	__u64		fmr_offset;	/* file offset of segment */
> +	__u64		fmr_length;	/* length of segment */
> +	__u64		fmr_reserved[3];	/* must be zero */
> +};
> +
> +struct fsmap_head {
> +	__u32		fmh_iflags;	/* control flags */
> +	__u32		fmh_oflags;	/* output flags */
> +	__u32		fmh_count;	/* # of entries in array incl. input */
> +	__u32		fmh_entries;	/* # of entries filled in (output). */
> +	__u64		fmh_reserved[6];	/* must be zero */
> +
> +	struct fsmap	fmh_keys[2];	/* low and high keys for the mapping search */
> +	struct fsmap	fmh_recs[];	/* returned records */
> +};
> +
> +.fi
> +.in
> +The two
> +.I fmh_keys
> +array elements specify the lowest and highest reverse-mapping
> +keys, respectively, for which userspace would like physical mapping
> +information.
> +A reverse mapping key consists of the tuple (device, block, owner, offset).
> +The owner and offset fields are part of the key because some filesystems
> +support sharing physical blocks between multiple files and
> +therefore may return multiple mappings for a given physical block.
> +.PP
> +Filesystem mappings are copied into the
> +.I fmh_recs
> +array, which immediately follows the header data.
> +.SS Fields of struct fsmap_head
> +.PP
> +The
> +.I fmh_iflags
> +field is a bitmask passed to the kernel to alter the output.
> +There are no flags defined, so callers must set this value to zero.
> +
> +.PP
> +The
> +.I fmh_oflags
> +field is a bitmask of flags set by the kernel concerning the returned mappings.
> +If
> +.B FMH_OF_DEV_T
> +is set, then the
> +.I fmr_device
> +field contains a
> +.B dev_t
> +value encoding the major and minor numbers of the block device.
> +
> +.PP
> +The
> +.I fmh_count
> +field contains the number of elements in the array being passed to the
> +kernel.
> +If this value is 0,
> +.I fmh_entries
> +will be set to the number of records that would have been returned had
> +the array been large enough;
> +no mapping information will be returned.
> +
> +.PP
> +The
> +.I fmh_entries
> +field contains the number of elements in the
> +.I fmh_recs
> +array that contain useful information.
> +
> +.PP
> +The
> +.I fmh_reserved
> +fields must be set to zero.
> +
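A minimal sketch of the count-only query described above (fmh_count set to 0), assuming the header from this patch is installed as linux/fsmap.h. The helper name fsmap_count_records is hypothetical, and the all-ones high key is an assumption about how to ask for the whole keyspace; the page itself does not spell that out.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fsmap.h>

/*
 * Hypothetical helper: ask how many records a full-range query would
 * return. Returns 0 and stores the count in *nr, or -1 with errno set.
 * *nr is left untouched on failure.
 */
int fsmap_count_records(int fd, uint32_t *nr)
{
    struct fsmap_head head;

    /* fmh_iflags and all reserved fields must be zero. */
    memset(&head, 0, sizeof(head));

    /* No record array is supplied, so the kernel only counts. */
    head.fmh_count = 0;

    /*
     * The low key stays all-zero. Assumption: an all-ones high key
     * covers every device, offset and owner in the filesystem.
     */
    head.fmh_keys[1].fmr_device = UINT32_MAX;
    head.fmh_keys[1].fmr_flags = UINT32_MAX;
    head.fmh_keys[1].fmr_physical = UINT64_MAX;
    head.fmh_keys[1].fmr_owner = UINT64_MAX;
    head.fmh_keys[1].fmr_offset = UINT64_MAX;

    if (ioctl(fd, FS_IOC_GETFSMAP, &head) < 0)
        return -1;

    *nr = head.fmh_entries;
    return 0;
}
```

A caller could use the count to size the fmh_recs array for a second call, bearing in mind that the layout may change between the two calls.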
> +.SS Keys
> +.PP
> +The two key records in
> +.B fsmap_head.fmh_keys
> +specify the lowest and highest extent records in the keyspace that the caller
> +wants returned.
> +A filesystem that can share blocks between files likely requires the tuple
> +.RI "(" "device" ", " "physical" ", " "owner" ", " "offset" ", " "flags" ")"
> +to uniquely index any filesystem mapping record.
> +Classic non-sharing filesystems might be able to identify any record with only
> +.RI "(" "device" ", " "physical" ", " "flags" ")."
> +For example, if the low key is set to (8:0, 36864, 0, 0, 0), the filesystem will
> +only return records for extents starting at or above 36KiB on disk.
> +If the high key is set to (8:0, 1048576, 0, 0, 0), only records below 1MiB will
> +be returned.
> +The format of
> +.B fmr_device
> +in the keys must match the format of the same field in the output records,
> +as defined below.
> +By convention, the field
> +.B fsmap_head.fmh_keys[0]
> +must contain the low key and
> +.B fsmap_head.fmh_keys[1]
> +must contain the high key for the request.
> +.PP
> +For convenience, if
> +.B fmr_length
> +is set in the low key, it will be added to
> +.IR fmr_physical " or " fmr_offset
> +as appropriate.
> +The caller can take advantage of this subtlety to set up subsequent calls
> +by copying
> +.B fsmap_head.fmh_recs[fsmap_head.fmh_entries - 1]
> +into the low key.
> +The function
> +.B fsmap_advance
> +provides this functionality.
> +
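The paging pattern described above can be sketched roughly as follows. This assumes linux/fsmap.h from this patch is available; fsmap_walk, fsmap_visit_fn and NR_RECS are hypothetical names used only for illustration. The copy into fmh_keys[0] is done by hand here, which is what the text says fsmap_advance provides, and the all-ones high key is an assumption about how to request the whole keyspace.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fsmap.h>

#define NR_RECS 128 /* hypothetical batch size per ioctl call */

/* Hypothetical callback type: invoked once per returned record. */
typedef void (*fsmap_visit_fn)(const struct fsmap *rec, void *priv);

/* Returns 0 after visiting every record, or -1 with errno set. */
int fsmap_walk(int fd, fsmap_visit_fn visit, void *priv)
{
    struct fsmap_head *head;
    const struct fsmap *last;
    int ret = -1;
    int saved_errno;

    /* Header plus NR_RECS records; calloc zeroes the reserved fields. */
    head = calloc(1, sizeof(*head) + NR_RECS * sizeof(struct fsmap));
    if (!head)
        return -1;

    head->fmh_count = NR_RECS;

    /* Low key is all-zero; assumed "everything" high key is all-ones. */
    head->fmh_keys[1].fmr_device = UINT32_MAX;
    head->fmh_keys[1].fmr_flags = UINT32_MAX;
    head->fmh_keys[1].fmr_physical = UINT64_MAX;
    head->fmh_keys[1].fmr_owner = UINT64_MAX;
    head->fmh_keys[1].fmr_offset = UINT64_MAX;

    for (;;) {
        if (ioctl(fd, FS_IOC_GETFSMAP, head) < 0)
            goto out;
        if (head->fmh_entries == 0)
            break;

        for (uint32_t i = 0; i < head->fmh_entries; i++)
            if (visit)
                visit(&head->fmh_recs[i], priv);

        last = &head->fmh_recs[head->fmh_entries - 1];
        if (last->fmr_flags & FMR_OF_LAST)
            break;

        /*
         * Copy the last record into the low key. Per the text above,
         * a non-zero fmr_length in the low key is added to the start
         * address, so the next call resumes after this extent.
         */
        head->fmh_keys[0] = *last;
    }
    ret = 0;
out:
    saved_errno = errno; /* keep the ioctl's errno across free() */
    free(head);
    errno = saved_errno;
    return ret;
}
```

A fixed batch size keeps memory use bounded regardless of how many extents the filesystem reports, at the cost of more ioctl calls on large filesystems.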
> +.SS Fields of struct fsmap
> +.PP
> +The
> +.I fmr_device
> +field uniquely identifies the underlying storage device.
> +If the
> +.B FMH_OF_DEV_T
> +flag is set in the header's
> +.I fmh_oflags
> +field, this field contains a
> +.B dev_t
> +from which major and minor numbers can be extracted.
> +If the flag is not set, this field contains a value that must be unique
> +for each unique storage device.
> +
> +.PP
> +The
> +.I fmr_physical
> +field contains the disk address of the extent in bytes.
> +
> +.PP
> +The
> +.I fmr_owner
> +field contains the owner of the extent.
> +This is an inode number unless
> +.B FMR_OF_SPECIAL_OWNER
> +is set in the
> +.I fmr_flags
> +field, in which case the value is determined by the filesystem.
> +See the section below about owner values for more details.
> +
> +.PP
> +The
> +.I fmr_offset
> +field contains the logical address in the mapping record in bytes.
> +This field has no meaning if the
> +.BR FMR_OF_SPECIAL_OWNER " or " FMR_OF_EXTENT_MAP
> +flags are set in
> +.IR fmr_flags "."
> +
> +.PP
> +The
> +.I fmr_length
> +field contains the length of the extent in bytes.
> +
> +.PP
> +The
> +.I fmr_flags
> +field is a bitmask of extent state flags.
> +The bits are:
> +.RS 0.4i
> +.TP
> +.B FMR_OF_PREALLOC
> +The extent is allocated but not yet written.
> +.TP
> +.B FMR_OF_ATTR_FORK
> +This extent contains extended attribute data.
> +.TP
> +.B FMR_OF_EXTENT_MAP
> +This extent contains extent map information for the owner.
> +.TP
> +.B FMR_OF_SHARED
> +Parts of this extent may be shared.
> +.TP
> +.B FMR_OF_SPECIAL_OWNER
> +The
> +.I fmr_owner
> +field contains a special value instead of an inode number.
> +.TP
> +.B FMR_OF_LAST
> +This is the last record in the filesystem.
> +.RE
> +
> +.PP
> +The
> +.I fmr_reserved
> +field will be set to zero.
> +
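To illustrate how the record fields above fit together, here is a small sketch that formats one record as text. It assumes linux/fsmap.h from this patch and glibc's major()/minor() macros; fsmap_format_rec and its output layout are hypothetical, chosen only for the example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/sysmacros.h>
#include <linux/fsmap.h>

/*
 * Hypothetical helper: render one record into buf.
 * oflags is the fmh_oflags value returned with the record.
 * Returns the snprintf() result for the final string.
 */
int fsmap_format_rec(const struct fsmap *rec, uint32_t oflags,
                     char *buf, size_t len)
{
    char dev[32];
    char owner[40];
    char flags[64] = "";

    /* fmr_device is a dev_t only when FMH_OF_DEV_T is set. */
    if (oflags & FMH_OF_DEV_T)
        snprintf(dev, sizeof(dev), "%u:%u",
                 major(rec->fmr_device), minor(rec->fmr_device));
    else
        snprintf(dev, sizeof(dev), "%" PRIu32,
                 (uint32_t)rec->fmr_device);

    /* fmr_owner is an inode number unless the special flag is set. */
    if (rec->fmr_flags & FMR_OF_SPECIAL_OWNER)
        snprintf(owner, sizeof(owner), "special 0x%" PRIx64,
                 (uint64_t)rec->fmr_owner);
    else
        snprintf(owner, sizeof(owner), "inode %" PRIu64,
                 (uint64_t)rec->fmr_owner);

    /* Append one word per extent state flag. */
    if (rec->fmr_flags & FMR_OF_PREALLOC)
        strcat(flags, " prealloc");
    if (rec->fmr_flags & FMR_OF_ATTR_FORK)
        strcat(flags, " attr_fork");
    if (rec->fmr_flags & FMR_OF_EXTENT_MAP)
        strcat(flags, " extent_map");
    if (rec->fmr_flags & FMR_OF_SHARED)
        strcat(flags, " shared");
    if (rec->fmr_flags & FMR_OF_LAST)
        strcat(flags, " last");

    return snprintf(buf, len,
                    "dev %s phys %" PRIu64 " len %" PRIu64 " owner %s%s",
                    dev, (uint64_t)rec->fmr_physical,
                    (uint64_t)rec->fmr_length, owner, flags);
}
```

For a record with fmr_device 0x800, FMR_OF_SHARED set, physical offset 36864, length 4096 and owner 133, formatted with FMH_OF_DEV_T in oflags, this yields "dev 8:0 phys 36864 len 4096 owner inode 133 shared".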
> +.SS Owner Values
> +Generally, the value of the
> +.I fmr_owner
> +field for non-metadata extents should be an inode number.
> +However, filesystems are under no obligation to report inode numbers;
> +they may instead report
> +.B FMR_OWN_UNKNOWN
> +if the inode number cannot easily be retrieved, if the caller lacks
> +sufficient privilege, if the filesystem does not support stable
> +inode numbers, or for any other reason.
> +If a filesystem wishes to condition the reporting of inode numbers based
> +on process capabilities, it is strongly urged that the
> +.B CAP_SYS_ADMIN
> +capability be used for this purpose.
> +.PP
> +The following special owner values are generic to all filesystems:
> +.RS 0.4i
> +.TP
> +.B FMR_OWN_FREE
> +Free space.
> +.TP
> +.B FMR_OWN_UNKNOWN
> +This extent is in use but its owner is not known or not easily retrieved.
> +.TP
> +.B FMR_OWN_METADATA
> +This extent is filesystem metadata.
> +.RE
> +
> +XFS can return the following special owner values:
> +.RS 0.4i
> +.TP
> +.B XFS_FMR_OWN_FREE
> +Free space.
> +.TP
> +.B XFS_FMR_OWN_UNKNOWN
> +This extent is in use but its owner is not known or not easily retrieved.
> +.TP
> +.B XFS_FMR_OWN_FS
> +Static filesystem metadata which exists at a fixed address.
> +These are the AG superblock, the AGF, the AGFL, and the AGI headers.
> +.TP
> +.B XFS_FMR_OWN_LOG
> +The filesystem journal.
> +.TP
> +.B XFS_FMR_OWN_AG
> +Allocation group metadata, such as the free space btrees and the
> +reverse mapping btrees.
> +.TP
> +.B XFS_FMR_OWN_INOBT
> +The inode and free inode btrees.
> +.TP
> +.B XFS_FMR_OWN_INODES
> +Inode records.
> +.TP
> +.B XFS_FMR_OWN_REFC
> +Reference count information.
> +.TP
> +.B XFS_FMR_OWN_COW
> +This extent is being used to stage a copy-on-write.
> +.TP
> +.B XFS_FMR_OWN_DEFECTIVE
> +This extent has been marked defective either by the filesystem or the
> +underlying device.
> +.RE
> +
> +ext4 can return the following special owner values:
> +.RS 0.4i
> +.TP
> +.B EXT4_FMR_OWN_FREE
> +Free space.
> +.TP
> +.B EXT4_FMR_OWN_UNKNOWN
> +This extent is in use but its owner is not known or not easily retrieved.
> +.TP
> +.B EXT4_FMR_OWN_FS
> +Static filesystem metadata which exists at a fixed address.
> +These are the superblock and the group descriptors.
> +.TP
> +.B EXT4_FMR_OWN_LOG
> +The filesystem journal.
> +.TP
> +.B EXT4_FMR_OWN_INODES
> +Inode records.
> +.TP
> +.B EXT4_FMR_OWN_BLKBM
> +Block bitmap.
> +.TP
> +.B EXT4_FMR_OWN_INOBM
> +Inode bitmap.
> +.RE
> +
> +.SH RETURN VALUE
> +On error, \-1 is returned, and
> +.I errno
> +is set to indicate the error.
> +.PP
> +.SH ERRORS
> +Error codes can be one of, but are not limited to, the following:
> +.TP
> +.B EINVAL
> +The array is not long enough, the keys do not point to a valid part of
> +the filesystem, the low key points to a higher point in the filesystem's
> +physical storage address space than the high key, or a non-zero value
> +was passed in one of the fields that must be zero.
> +.TP
> +.B EFAULT
> +The pointer passed in was not mapped to a valid memory address.
> +.TP
> +.B EBADF
> +.IR fd
> +is not open for reading.
> +.TP
> +.B EOPNOTSUPP
> +The filesystem does not support this command.
> +.TP
> +.B EUCLEAN
> +The filesystem metadata is corrupt and needs repair.
> +.TP
> +.B EBADMSG
> +The filesystem has detected a checksum error in the metadata.
> +.TP
> +.B ENOMEM
> +Insufficient memory to process the request.
> +
> +.SH EXAMPLE
> +.PP
> +See io/fsmap.c in the xfsprogs distribution for a sample program.
> +
> +.SH CONFORMING TO
> +This API is Linux-specific.
> +Not all filesystems support it.
> +.fi
> +.in
> +.SH SEE ALSO
> +.BR ioctl (2)
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/