Re: [PATCH 5/8] xfs_db: check that metadata updates have been committed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 20, 2020 at 10:34:50AM -0800, Darrick J. Wong wrote:
> On Thu, Feb 20, 2020 at 12:58:57PM -0500, Brian Foster wrote:
> > On Thu, Feb 20, 2020 at 08:58:40AM -0800, Darrick J. Wong wrote:
> > > On Thu, Feb 20, 2020 at 09:06:23AM -0500, Brian Foster wrote:
> > > > On Wed, Feb 19, 2020 at 05:42:13PM -0800, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > > > 
> > > > > Add a new function that will ensure that everything we scribbled on has
> > > > > landed on stable media, and report the results.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > > > ---
> > > > >  db/init.c |   14 ++++++++++++++
> > > > >  1 file changed, 14 insertions(+)
> > > > > 
> > > > > 
> > > > > diff --git a/db/init.c b/db/init.c
> > > > > index 0ac37368..e92de232 100644
> > > > > --- a/db/init.c
> > > > > +++ b/db/init.c
> > > > > @@ -184,6 +184,7 @@ main(
> > > > >  	char	*input;
> > > > >  	char	**v;
> > > > >  	int	start_iocur_sp;
> > > > > +	int	d, l, r;
> > > > >  
> > > > >  	init(argc, argv);
> > > > >  	start_iocur_sp = iocur_sp;
> > > > > @@ -216,6 +217,19 @@ main(
> > > > >  	 */
> > > > >  	while (iocur_sp > start_iocur_sp)
> > > > >  		pop_cur();
> > > > > +
> > > > > +	libxfs_flush_devices(mp, &d, &l, &r);
> > > > > +	if (d)
> > > > > +		fprintf(stderr, _("%s: cannot flush data device (%d).\n"),
> > > > > +				progname, d);
> > > > > +	if (l)
> > > > > +		fprintf(stderr, _("%s: cannot flush log device (%d).\n"),
> > > > > +				progname, l);
> > > > > +	if (r)
> > > > > +		fprintf(stderr, _("%s: cannot flush realtime device (%d).\n"),
> > > > > +				progname, r);
> > > > > +
> > > > > +
> > > > 
> > > > Seems like we could reduce some boilerplate by passing progname into
> > > > libxfs_flush_devices() and letting it dump out of the error messages,
> > > > unless there's some future code that cares about individual device error
> > > > state.
> > > 
> > > Such a program could call libxfs_flush_devices directly, as we do here.
> > > 
> > 
> > Right.. but does anything actually care about that level of granularity
> > right now beyond having a nicer error message?
> 
> No, afaict.
> 
> > > Also, progname is defined in libxfs so we don't even need to pass it as
> > > an argument.
> > > 
> > 
> > Ok.
> > 
> > > I had originally thought that we should try not to add fprintf calls to
> > > libxfs because libraries aren't really supposed to be doing things like
> > > that, but perhaps you're right that all of this should be melded into
> > > something else.
> > > 
> > 
> > Yeah, fair point, though I guess it depends on the particular library. 
> 
> I mean... is libxfs even a real library? :)

It's an internal abstraction to allow code to be shared easily with
the kernel and between xfsprogs binaries. It is not a library in the
sense it has a fixed API and ABI that compatibility has to be
maintained across releases (i.e. like, say, libhandle). It is a
library in the sense is contains shared code that things within the
build don't have to re-implement them over and over again, and
because it is internal that means the rules for external/distro
level libraries don't need to be strictly applied.

e.g. if -everything- uses a global variable and has to declares it
themselves so the shared internal code can access it, then why not
just declare it in the shared code? :)

> > > I dunno.  My current thinking is that libxfs_umount should call
> > > libxfs_flush_devices() and print error messages as necessary, and return
> > > error codes as appropriate.  xfs_repair can then check the umount return
> > > value and translate that into exit(1) as required.  The device_close
> > > functions will fsync a second time, but that shouldn't be a big deal
> > > because we haven't dirtied anything in the meantime.
> > > 
> > > Thoughts?
> > > 
> > 
> > I was thinking of having a per-device libxfs_device_flush() along the
> > lines of libxfs_device_close() and separating out that functionality,
> > but one could argue we're also a bit inconsistent between libxfs_init()
> > opening the devices and having to close them individually.
> 
> Yeah, I don't understand why libxfs_destroy doesn't empty out the same
> struct libxfs_init that libxfs_init populates.  Or why we have a global
> variable named "x", or why the buffer cache is a global variable.
> However, those sound like refactoring for another series.

You mean structure the unmount code like we do in teh kernel? e.g:

> > I think
> > having libxfs_umount() do a proper purge -> flush and returning any
> > errors instead is a fair tradeoff for simplicity. Removing the
> > flush_devices() API also eliminates risk of somebody incorrectly
> > attempting the flush after the umount frees the buftarg structures
> > (without reinitializing pointers :P).

You mean like this code that I'm slowly working on to bring the
xfs_buf.c code across to userspace and get rid of the heap of crap
we have in libxfs/{rdwr,cache}.c now and be able to use AIO properly
and do non-synchronous delayed writes like we do in the kernel?

libxfs/init.c:
....
static void
buftarg_cleanup(
        struct xfs_buftarg      *btp)
{
        if (!btp)
                return;

        while (btp->bt_lru.l_count > 0)
                xfs_buftarg_shrink(btp, 1000);
        xfs_buftarg_wait(btp);
        xfs_buftarg_free(btp);
}

/*
 * Release any resource obtained during a mount.
 */
void
libxfs_umount(
        struct xfs_mount        *mp)
{
        struct xfs_perag        *pag;
        int                     agno;

        libxfs_rtmount_destroy(mp);

        buftarg_cleanup(mp->m_ddev_targp);
        buftarg_cleanup(mp->m_rtdev_targp);
        if (mp->m_logdev_targp != mp->m_ddev_targp)
                buftarg_cleanup(mp->m_logdev_targp);
.....

libxfs/xfs_buftarg.c:
.....
void
xfs_buftarg_free(
        struct xfs_buftarg      *btp)
{
        ASSERT(percpu_counter_sum(&btp->bt_io_count) == 0);
        percpu_counter_destroy(&btp->bt_io_count);
        platform_flush_device(btp->bt_fd, btp->bt_bdev);
	libxfs_device_close(btp->bt_bdev);
        free(btp);
}

I haven't added the error returns for this code yet - I'm still
doing the conversion and making it work.

I'll probably have to throw the vast majority of that patchset away
and start again if all this API change that darrick has done is
merged. And that will probably involve me throwing away all of the
changes in this patch series because they just don't make any sense
once the code is restructured properly....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux