Re: [PATCH 1/1] xfs_spaceman: report health problems

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 04, 2019 at 02:52:40PM +1000, Dave Chinner wrote:
> On Mon, Aug 26, 2019 at 02:20:54PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > 
> > Use the fs and ag geometry ioctls to report health problems to users.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > ---
> >  include/xfrog.h         |    2 
> >  libfrog/fsgeom.c        |   11 +
> >  man/man8/xfs_spaceman.8 |   28 +++
> >  spaceman/Makefile       |    2 
> >  spaceman/health.c       |  432 +++++++++++++++++++++++++++++++++++++++++++++++
> >  spaceman/init.c         |    1 
> >  spaceman/space.h        |    1 
> >  7 files changed, 476 insertions(+), 1 deletion(-)
> >  create mode 100644 spaceman/health.c
> > 
> > 
> > diff --git a/include/xfrog.h b/include/xfrog.h
> > index 5748e967..3a43a403 100644
> > --- a/include/xfrog.h
> > +++ b/include/xfrog.h
> > @@ -177,4 +177,6 @@ struct xfs_inogrp;
> >  int xfrog_inumbers(struct xfs_fd *xfd, uint64_t *lastino, uint32_t icount,
> >  		struct xfs_inogrp *ubuffer, uint32_t *ocount);
> >  
> > +int xfrog_ag_geometry(int fd, unsigned int agno, struct xfs_ag_geometry *ageo);
> > +
> >  #endif	/* __XFROG_H__ */
> > diff --git a/libfrog/fsgeom.c b/libfrog/fsgeom.c
> > index 17479e4a..cddb5a39 100644
> > --- a/libfrog/fsgeom.c
> > +++ b/libfrog/fsgeom.c
> > @@ -131,3 +131,14 @@ xfrog_close(
> >  	xfd->fd = -1;
> >  	return ret;
> >  }
> > +
> > +/* Try to obtain an AG's geometry. */
> > +int
> > +xfrog_ag_geometry(
> > +	int			fd,
> > +	unsigned int		agno,
> > +	struct xfs_ag_geometry	*ageo)
> > +{
> > +	ageo->ag_number = agno;
> > +	return ioctl(fd, XFS_IOC_AG_GEOMETRY, ageo);
> 
> Does it need this first:
> 
> 	memset(ageo, 0, sizeof(*ageo));
> 
> Because I don't think the callers zero it....

Yep, and the caller was fixed up when I, er, twiddled the AG geometry
ioctl definition last week.

> > +static const struct flag_map inode_flags[] = {
> > +	{
> > +		.mask = XFS_BS_SICK_INODE,
> > +		.descr = "inode core",
> > +	},
> > +	{
> > +		.mask = XFS_BS_SICK_BMBTD,
> > +		.descr = "data fork",
> > +	},
> > +	{
> > +		.mask = XFS_BS_SICK_BMBTA,
> > +		.descr = "extended attribute fork",
> > +	},
> > +	{
> > +		.mask = XFS_BS_SICK_BMBTC,
> > +		.descr = "copy on write fork",
> > +	},
> > +	{
> > +		.mask = XFS_BS_SICK_DIR,
> > +		.descr = "directory",
> > +	},
> > +	{
> > +		.mask = XFS_BS_SICK_XATTR,
> > +		.descr = "extended attributes",
> > +	},
> > +	{
> > +		.mask = XFS_BS_SICK_SYMLINK,
> > +		.descr = "symbolic link target",
> > +	},
> > +	{
> > +		.mask = XFS_BS_SICK_PARENT,
> > +		.descr = "parent pointers",
> 
> This needs a "has_parent_pointers" feature check function,
> doesn't it? Or is this already a valid status for directory inodes?

It's already a valid status for directories, since xscrub checks that
a directory's '..' points to another valid directory that points back.

> > +static int
> > +report_bulkstat_health(
> > +	xfs_agnumber_t		agno)
> > +{
> > +	struct xfs_bstat	bstat[128];
> > +	char			descr[256];
> > +	uint64_t		startino = 0;
> > +	uint64_t		lastino = -1ULL;
> > +	uint32_t		ocount;
> > +	uint32_t		i;
> > +	int			error;
> > +
> > +	if (agno != NULLAGNUMBER) {
> > +		startino = xfrog_agino_to_ino(&file->xfd, agno, 0);
> > +		lastino = xfrog_agino_to_ino(&file->xfd, agno + 1, 0) - 1;
> > +	}
> > +
> > +	while ((error = xfrog_bulkstat(&file->xfd, &startino, 128, bstat,
> 
> Nit: use a define for the number of inodes to bulkstat.

Ok.  IIRC the xfrog_bulkstat conversion later on will zap most of this.

> > +health_f(
> > +	int			argc,
> > +	char			**argv)
> > +{
> > +	unsigned long long	x;
> > +	xfs_agnumber_t		agno;
> > +	bool			default_report = true;
> > +	int			c;
> > +	int			ret;
> > +
> > +	reported = 0;
> > +
> > +	if (file->xfd.fsgeom.version != XFS_FSOP_GEOM_VERSION_V5) {
> > +		perror("health");
> > +		return 1;
> > +	}
> > +
> > +	while ((c = getopt(argc, argv, "a:cfi:q")) != EOF) {
> > +		switch (c) {
> > +		case 'a':
> > +			default_report = false;
> > +			errno = 0;
> > +			x = strtoll(optarg, NULL, 10);
> > +			if (!errno && x >= NULLAGNUMBER)
> > +				errno = ERANGE;
> > +			if (errno) {
> > +				perror("ag health");
> > +				return 1;
> > +			}
> > +			agno = x;
> > +			ret = report_ag_sick(agno);
> > +			if (!ret && comprehensive)
> > +				ret = report_bulkstat_health(agno);
> > +			if (ret)
> > +				return 1;
> > +			break;
> > +		case 'c':
> > +			comprehensive = true;
> > +			break;
> 
> There's a command line ordering problem here - - "-a" and "-f" 
> check the comprehensive flag and do additional stuff based on it.
> 
> So I think the -a and -f processing need to be done outside
> the args processing loop, or we need two loops to extract the
> -c first...
> 
> Otherwise looks OK.

Ok, will do.  Thanks for the review. :)

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux