On Wed, Sep 04, 2019 at 02:52:40PM +1000, Dave Chinner wrote: > On Mon, Aug 26, 2019 at 02:20:54PM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > > Use the fs and ag geometry ioctls to report health problems to users. > > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > --- > > include/xfrog.h | 2 > > libfrog/fsgeom.c | 11 + > > man/man8/xfs_spaceman.8 | 28 +++ > > spaceman/Makefile | 2 > > spaceman/health.c | 432 +++++++++++++++++++++++++++++++++++++++++++++++ > > spaceman/init.c | 1 > > spaceman/space.h | 1 > > 7 files changed, 476 insertions(+), 1 deletion(-) > > create mode 100644 spaceman/health.c > > > > > > diff --git a/include/xfrog.h b/include/xfrog.h > > index 5748e967..3a43a403 100644 > > --- a/include/xfrog.h > > +++ b/include/xfrog.h > > @@ -177,4 +177,6 @@ struct xfs_inogrp; > > int xfrog_inumbers(struct xfs_fd *xfd, uint64_t *lastino, uint32_t icount, > > struct xfs_inogrp *ubuffer, uint32_t *ocount); > > > > +int xfrog_ag_geometry(int fd, unsigned int agno, struct xfs_ag_geometry *ageo); > > + > > #endif /* __XFROG_H__ */ > > diff --git a/libfrog/fsgeom.c b/libfrog/fsgeom.c > > index 17479e4a..cddb5a39 100644 > > --- a/libfrog/fsgeom.c > > +++ b/libfrog/fsgeom.c > > @@ -131,3 +131,14 @@ xfrog_close( > > xfd->fd = -1; > > return ret; > > } > > + > > +/* Try to obtain an AG's geometry. */ > > +int > > +xfrog_ag_geometry( > > + int fd, > > + unsigned int agno, > > + struct xfs_ag_geometry *ageo) > > +{ > > + ageo->ag_number = agno; > > + return ioctl(fd, XFS_IOC_AG_GEOMETRY, ageo); > > Does it need this first: > > memset(ageo, 0, sizeof(*ageo)); > > Because I don't think the callers zero it.... Yep, and the caller was fixed up when I, er, twiddled the AG geometry ioctl definition last week. > > +static const struct flag_map inode_flags[] = { > > + { > > + .mask = XFS_BS_SICK_INODE, > > + .descr = "inode core", > > + }, > > + { > > + .mask = XFS_BS_SICK_BMBTD, > > + .descr = "data fork", > > + }, > > + { > > + .mask = XFS_BS_SICK_BMBTA, > > + .descr = "extended attribute fork", > > + }, > > + { > > + .mask = XFS_BS_SICK_BMBTC, > > + .descr = "copy on write fork", > > + }, > > + { > > + .mask = XFS_BS_SICK_DIR, > > + .descr = "directory", > > + }, > > + { > > + .mask = XFS_BS_SICK_XATTR, > > + .descr = "extended attributes", > > + }, > > + { > > + .mask = XFS_BS_SICK_SYMLINK, > > + .descr = "symbolic link target", > > + }, > > + { > > + .mask = XFS_BS_SICK_PARENT, > > + .descr = "parent pointers", > > This needs a "has_parent_pointers" feature check function, > doesn't it? Or is this already a valid status for directory inodes? It's already a valid status for directories, since xscrub checks that a directory's '..' points to another valid directory that points back. > > +static int > > +report_bulkstat_health( > > + xfs_agnumber_t agno) > > +{ > > + struct xfs_bstat bstat[128]; > > + char descr[256]; > > + uint64_t startino = 0; > > + uint64_t lastino = -1ULL; > > + uint32_t ocount; > > + uint32_t i; > > + int error; > > + > > + if (agno != NULLAGNUMBER) { > > + startino = xfrog_agino_to_ino(&file->xfd, agno, 0); > > + lastino = xfrog_agino_to_ino(&file->xfd, agno + 1, 0) - 1; > > + } > > + > > + while ((error = xfrog_bulkstat(&file->xfd, &startino, 128, bstat, > > Nit: use a define for the number of inodes to bulkstat. Ok. IIRC the xfrog_bulkstat conversion later on will zap most of this. > > +health_f( > > + int argc, > > + char **argv) > > +{ > > + unsigned long long x; > > + xfs_agnumber_t agno; > > + bool default_report = true; > > + int c; > > + int ret; > > + > > + reported = 0; > > + > > + if (file->xfd.fsgeom.version != XFS_FSOP_GEOM_VERSION_V5) { > > + perror("health"); > > + return 1; > > + } > > + > > + while ((c = getopt(argc, argv, "a:cfi:q")) != EOF) { > > + switch (c) { > > + case 'a': > > + default_report = false; > > + errno = 0; > > + x = strtoll(optarg, NULL, 10); > > + if (!errno && x >= NULLAGNUMBER) > > + errno = ERANGE; > > + if (errno) { > > + perror("ag health"); > > + return 1; > > + } > > + agno = x; > > + ret = report_ag_sick(agno); > > + if (!ret && comprehensive) > > + ret = report_bulkstat_health(agno); > > + if (ret) > > + return 1; > > + break; > > + case 'c': > > + comprehensive = true; > > + break; > > There's a command line ordering problem here - - "-a" and "-f" > check the comprehensive flag and do additional stuff based on it. > > So I think the -a and -f processing need to be done outside > the args processing loop, or we need two loops to extract the > -c first... > > Otherwise looks OK. Ok, will do. Thanks for the review. :) --D > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx