[PATCH] RFC: Changing dm core data structure relationships (0/5)

Hello,

This is a proposal to change the dm core data structure relationships
for simplification, SMP race avoidance and guaranteed release
ordering. The patches don't affect dm targets.

I would like to hear your opinion on this.

The core motivations for these changes are as follows:
  - Device open and remove can race.
    As a result, a device we have opened might be freed by someone else.
  - We lose control of a device once it is removed.
    e.g. if the device is opened and suspended, we can no longer
    really remove the device.
  - With dm-table-store-md.patch in -mm, a table holds a reference to md.
    So we would like to guarantee that the table is freed before md.
This is achieved in the third patch.

The patches were created and tested against 2.6.16-rc6-mm1 plus:
  - dm-flush-queue-eintr.patch
  - dm-tidy-mdptr.patch
  - dm-table-store-md.patch
  - dm-md-dependency-tree-in-sysfs-add_subdirs.patch
  - dm-md-dependency-tree-in-sysfs-bd_claim_by_kobj.patch
  - dm-md-dependency-tree-in-sysfs-dm_deptree.patch
  - dm-md-dependency-tree-in-sysfs-kobject_add_dir.patch
  - dm-md-dependency-tree-in-sysfs-md_deptree.patch

Thanks,
-- 
Jun'ichi Nomura, NEC Solutions (America), Inc.

--------------------------------------------------------------------------
Description of patches

01-dm-sem.patch
  Literally replace _hash_lock with dm_sem. Make it extern.

02-put-table-after-unlock.patch
  Move dm_table_put(), which may destroy the table, outside of dm_sem.
  This prepares for the following patches, in which dm_sem is held in
  device close; the table destructor could otherwise deadlock.
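
To illustrate the pattern, here is a minimal single-threaded sketch in
user-space C. It is not the patch itself: the struct layouts, the
dm_sem_held flag that stands in for the semaphore, and the helper names
(down_dm_sem, table_create, set_new_map and so on) are assumptions made
for the example. The point it shows is that the old table is detached
while the lock is held, but the final put happens only after unlock, so
a destructor that needs dm_sem cannot deadlock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical single-threaded model; not the kernel's dm structures. */
struct table {
    int refs;                   /* reference count */
};

struct hash_cell {
    struct table *new_map;      /* inactive table, if any */
};

static int dm_sem_held;         /* 1 while the modelled dm_sem is held */
static int tables_destroyed;    /* number of tables destroyed so far */

static void down_dm_sem(void)
{
    /* Taking a non-recursive lock twice would deadlock; model it as a failure. */
    assert(!dm_sem_held);
    dm_sem_held = 1;
}

static void up_dm_sem(void)
{
    assert(dm_sem_held);
    dm_sem_held = 0;
}

/* Models a destructor that closes devices, which in turn takes dm_sem. */
static void table_destroy(struct table *t)
{
    down_dm_sem();
    up_dm_sem();
    tables_destroyed++;
    free(t);
}

static void table_put(struct table *t)
{
    if (t && --t->refs == 0)
        table_destroy(t);
}

static struct table *table_create(void)
{
    struct table *t = calloc(1, sizeof(*t));

    if (t)
        t->refs = 1;
    return t;
}

/* Swap in a new inactive table; drop the old one only after unlocking. */
static void set_new_map(struct hash_cell *hc, struct table *t)
{
    struct table *old;

    down_dm_sem();
    old = hc->new_map;          /* detach under the lock */
    hc->new_map = t;
    up_dm_sem();

    table_put(old);             /* may destroy; dm_sem is not held here */
}
```

Calling table_put(old) before up_dm_sem() in this model would trip the
assertion in down_dm_sem(), which is the stand-in for the deadlock.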

03-hc-open-count.patch
  Move the open counter to hc. Disable removal of an hc that has openers.
  md->holders becomes purely a reference counter of md.
    * disk->private_data points to hc rather than md.
      When hc is removed, disk->private_data becomes NULL and no new open
      is possible.
    * Open count is protected by dm_sem to avoid open vs. remove race.
    * Upon hc removal, tables are released from md.
    * A table gets a reference to md on creation and puts it on free.
      Thus we can guarantee md is not freed before its tables are freed.
    * dm_get_md(), which maps minor to md, checks interface_ptr to avoid
      returning an md that is being removed.
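
As a rough sketch of the open vs. remove rules, the following
user-space C model may help. It is only an illustration under
assumptions: disk_model, the model_* function names, the dm_sem_held
flag standing in for dm_sem, and the -ENXIO/-EBUSY return codes are
made up for the example and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical single-threaded model; names other than md/hc/count are assumed. */
struct mapped_device {
    int holders;                    /* pure reference count of md */
};

struct hash_cell {
    struct mapped_device *md;
    int count;                      /* open count, protected by dm_sem */
};

struct disk_model {
    struct hash_cell *private_data; /* NULL once hc has been removed */
};

static int dm_sem_held;             /* stands in for the global dm_sem */

static void down_dm_sem(void) { assert(!dm_sem_held); dm_sem_held = 1; }
static void up_dm_sem(void)   { assert(dm_sem_held);  dm_sem_held = 0; }

/* Open succeeds only while the disk is still bound to an hc. */
static int model_open(struct disk_model *disk)
{
    struct hash_cell *hc;
    int r = 0;

    down_dm_sem();
    hc = disk->private_data;
    if (hc)
        hc->count++;
    else
        r = -ENXIO;                 /* assumed error code */
    up_dm_sem();
    return r;
}

static void model_release(struct disk_model *disk)
{
    down_dm_sem();
    /* hc cannot have been removed while an opener still exists. */
    assert(disk->private_data && disk->private_data->count > 0);
    disk->private_data->count--;
    up_dm_sem();
}

/* Removal is refused while there are openers. */
static int model_remove(struct disk_model *disk)
{
    struct hash_cell *hc;
    int r = 0;

    down_dm_sem();
    hc = disk->private_data;
    if (!hc)
        r = -ENXIO;                 /* already removed (assumed code) */
    else if (hc->count > 0)
        r = -EBUSY;                 /* assumed error code */
    else
        disk->private_data = NULL;  /* no new open is possible from here */
    up_dm_sem();
    return r;
}
```

Because the count check and the clearing of private_data happen under
the same lock as the increment in open, an open can never slip in
between the check and the removal in this model.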

04-remove-dm-md-get.patch
    * dm_get_md() is removed. There is no user.

05-move-new-map-to-md.patch
  Moving hc->new_map to md->new_map.
    * Map loading no longer needs global dm_sem.
      Just grab md and lock map_lock.
    * Most of __find_device_hash_cell() can be replaced by find_device().

Should be applied in this order.

Concerns:
  - Is moving dm_blk_open/close to dm-ioctl.c the right thing to do?
    Would it be better to rename dm-ioctl.c to dm-interface.c or similar?
  - Do other .c files need to get hc->count?
  - Does dm_table_get_md() really need dm_get()?
    We can assume md exists whenever the table exists.

--------------------------------------------------------------------
Structures:
  - mapped_device (md)
  - hash_cell (hc)
  - dm_table (table)
  - gendisk (disk)
  - request_queue (queue)
  - minor number (minor)

Relationships:
  - hc owns 1 md
  - md owns 0-2 tables, 1 disk
  - disk owns 1 queue and 1 minor number

Roles:
  - hc manages access from outside of dm (block device, ioctl).
    It has open counter. hc can be removed from list only when
    there is no opener.
  - md is the core of a dm device. Access to a table needs to go through md.

Locks:
  - dm_sem: hc list, gendisk->dm device binding
  - io_lock: DMF_BLOCK_IO, md->deferred, md->wait
  - map_lock: md->map, md->new_map
  - suspend_lock: DMF_SUSPENDED

Ordering:
  - Allocation/Register
       1) allocation of md, hc, disk, queue, minor can occur in any order
       2) register md to disk
       3) register disk (it eventually registers minor and queue)
          <device visible from block layer>
       4) register md to hc
          <block device access enabled>
       5) register hc to hash list
          <ioctl access enabled>
       6) table allocation (load)
       7) table register (resume -> bind)
          <I/O enabled>
  - Freeing/Unregister (reordered proposal)
       [lock dm_sem]
       3) unregister hc from hash list
          <ioctl access disabled>
       4) unregister md from hc
       6) unregister md from disk
          <block device access rejected>
       [unlock dm_sem]
       1) table unregister (resume -> unbind)
          <I/O disabled>
       2) table free
       5) unregister disk
          <block device access disabled>
       7) free in any order

       (*) Moving 1) and 2) later has the benefit that we don't
           need to care about racing with new table registration.
       (*) The order of 5) and 6) is swapped because 5) eventually
           causes kobject_uevent (hotplug), which may sleep.
           That is not a good thing to do under the global dm_sem.
           Doing 6) makes future opens fail, so it has the same
           shutdown effect and is faster.
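
The reordered teardown can be sketched as the small user-space C model
below. It is only an illustration: the step() helper, the order log
and the dm_sem_held flag are hypothetical, and the step numbers are
those of the Allocation/Register list above. The model checks that
table free (see 02-put-table-after-unlock.patch) and disk
unregistration (which may sleep) never run while the global lock is
held.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int dm_sem_held;     /* stands in for the global dm_sem */
static char order[16];      /* step numbers, in execution order */
static size_t nsteps;

/* Record a step; steps flagged needs_unlocked must not run under dm_sem. */
static void step(char id, int needs_unlocked)
{
    assert(!(needs_unlocked && dm_sem_held));
    assert(nsteps + 1 < sizeof(order));
    order[nsteps++] = id;
    order[nsteps] = '\0';
}

/* Hypothetical removal path following the reordered proposal. */
static void remove_device(void)
{
    memset(order, 0, sizeof(order));
    nsteps = 0;

    dm_sem_held = 1;        /* [lock dm_sem] */
    step('3', 0);           /* unregister hc from hash list: ioctl access disabled */
    step('4', 0);           /* unregister md from hc */
    step('6', 0);           /* unregister md from disk: new opens are rejected */
    dm_sem_held = 0;        /* [unlock dm_sem] */

    step('1', 0);           /* table unregister: I/O disabled */
    step('2', 1);           /* table free: destructor may need dm_sem */
    step('5', 1);           /* unregister disk: may sleep via kobject_uevent */
    step('7', 0);           /* free the remaining objects in any order */
}
```

After remove_device() returns, order holds the sequence of step numbers
as executed, which makes it easy to compare against the list above.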
