Re: [GIT PULL] Block changes for 4.18-rc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 4, 2018 at 8:54 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 6/4/18 9:51 AM, Linus Torvalds wrote:
> >
> > Why the hell doesn't it just do bioset_exit() on the originals instead,
> > before freeing the mddev?
>
> CC'ing Neil to get his input on how best to clean that up, I'd
> be much more comfortable with that logic improved rather than
> just catering to the stack usage issue. Also include Arnd, as
> he had a test patch for it.

Also adding Tejun, because the only reason for that odd sequencing
seems to be that

 - we want to queue the deletion work *inside* the spinlock, because
it actually has synchronization between the workqueue and the
'all_mddevs_lock' spinlock.

 - we want to bioset_exit() the fields *outside* the spinlock.

So what it does now is to copy those big 'struct bio_set' fields
because they might go away from under it.

Tejun, the code in question is mddev_put() in drivers/md/md.c, and it
basically does

                INIT_WORK(&mddev->del_work, mddev_delayed_delete);
                queue_work(md_misc_wq, &mddev->del_work);

inside a spinlock, but then wants to do some stuff *outside* the
spinlock before that mddev_delayed_delete() thing actually gets
called.

Is there some way to "half-queue" the work - enough that a
flush_workqueue() will wait for it, but still guaranteeing that it
won't actually run until released?

IOW, what I think that code really wants to do is perhaps something like

        /* under  &all_mddevs_lock spinlock */
        .. remove from all lists so that it can't be found ..

        .. but add it to the md_misc_wq so that people will wait for it ..

                INIT_WORK_LOCKED(&mddev->del_work, mddev_delayed_delete);
                queue_work(md_misc_wq, &mddev->del_work);
        ...

        spin_unlock(&all_mddevs_lock);

        /* mddev is still guaranteed live */
        bioset_exit(&mddev->bio_set);
        bioset_exit(&mddev->sync_set);

        /* NOW we can release it */
        if (queued)
                unlock_work(&mddev->del_work);
        else
                kfree(mddev);

or something like that?

The above is *ENTIRELY* hand-wavy garbage - don't take it seriously
per se, consider it just pseudo-code for documentation reasons, not
serious in any other way.

             Linus



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux