Re: [PATCH v5 01/40] dm: add documentation for dm-vdo target

On 12/28/23 14:16, Matthias Kaehlcke wrote:
Hi,

On Fri, Nov 17, 2023 at 03:59:18PM -0500, Matthew Sakai wrote:
This adds the admin-guide documentation for dm-vdo.

vdo.rst is the guide to using dm-vdo. vdo-design is an overview of the
design of dm-vdo.

Co-developed-by: J. corwin Coburn <corwin@xxxxxxxxxxxxxx>
Signed-off-by: J. corwin Coburn <corwin@xxxxxxxxxxxxxx>
Signed-off-by: Matthew Sakai <msakai@xxxxxxxxxx>
---
  .../admin-guide/device-mapper/vdo-design.rst  | 415 ++++++++++++++++++
  .../admin-guide/device-mapper/vdo.rst         | 388 ++++++++++++++++
  2 files changed, 803 insertions(+)
  create mode 100644 Documentation/admin-guide/device-mapper/vdo-design.rst
  create mode 100644 Documentation/admin-guide/device-mapper/vdo.rst

diff --git a/Documentation/admin-guide/device-mapper/vdo-design.rst b/Documentation/admin-guide/device-mapper/vdo-design.rst
new file mode 100644
index 000000000000..c82d51071c7d
--- /dev/null
+++ b/Documentation/admin-guide/device-mapper/vdo-design.rst
@@ -0,0 +1,415 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+================
+Design of dm-vdo
+================
+
+The dm-vdo (virtual data optimizer) target provides inline deduplication,
+compression, zero-block elimination, and thin provisioning. A dm-vdo target
+can be backed by up to 256TB of storage, and can present a logical size of
+up to 4PB.

[snip]

+	block map cache size:
+		The size of the block map cache, as a number of 4096-byte
+		blocks. The minimum and recommended value is 32768 blocks.
+		If the logical thread count is non-zero, the cache size
+		must be at least 4096 blocks per logical thread.

If I understand correctly, the minimum of 32768 blocks results in the 128 MB
metadata cache mentioned in 'Tuning', which allows access to up to 100 GB
of logical space.

Is there a strict reason for this minimum? I'm evaluating the use of vdo on
systems with a relatively small vdo volume (say 4GB) and 'only' 4-8 GB of
RAM. The 128 MB of metadata cache would be a sizeable chunk of that, which
could make the use of vdo infeasible.

The short answer is that VDO can often use a smaller cache than the default, but it likely won't help in the way you want it to.
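
For reference, the 128 MB figure follows directly from the numbers in the quoted documentation. The sketch below only reproduces that arithmetic; the variable names and the thread count of 4 are hypothetical and chosen for illustration, not taken from VDO itself.

```shell
#!/bin/sh
# Sketch: RAM used by the block map cache, from the figures quoted above.
block_size=4096        # bytes per cache block (from the quoted documentation)
cache_blocks=32768     # documented minimum and recommended value
logical_threads=4      # hypothetical thread count, for illustration only

# Cache size in MiB: blocks * bytes per block, converted to MiB.
cache_mib=$(( cache_blocks * block_size / 1024 / 1024 ))

# Documented floor when the logical thread count is non-zero:
# at least 4096 blocks per logical thread.
per_thread_min=$(( logical_threads * 4096 ))

echo "block map cache: ${cache_mib} MiB"
echo "minimum blocks for ${logical_threads} logical threads: ${per_thread_min}"
```

With these inputs the per-thread floor is well below 32768 blocks, so the 32768-block minimum is the constraint that matters for the RAM budget discussed here.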

+Examples:
+
+Start a previously-formatted vdo volume with 1 GB logical space and 1 GB
+physical space, storing to /dev/dm-1 which has more than 1 GB of space.
+
+::
+
+	dmsetup create vdo0 --table \
+	"0 2097152 vdo V4 /dev/dm-1 262144 4096 32768 16380"

IIUC the backing device needs to be previously formatted. The formatting
fails when the size of the backing device is < 5GB:

vdoformat /dev/loop8
   Minimum required size for VDO volume: 5063921664 bytes
   vdoformat: formatVDO failed on '/dev/loop8': VDO Status: Out of space

That was with 'vdoformat' from https://github.com/dm-vdo/vdo/

It would be great if somewhat smaller devices could be supported.

VDO was designed to handle the challenge of data deduplication in very large storage pools. It generally is not very useful for very small pools. The first question to ask is whether VDO can actually provide any value in the sort of environment you're using. VDO generally takes the strategy of saving storage space by using extra RAM and CPU cycles. In addition, VDO needs to track a certain amount of metadata, which reduces the amount of storage available for actual user data.

For vdoformat, the biggest consideration is the deduplication index and other metadata, which are basically a fixed cost of about 3.5GB. In order for VDO to be useful, VDO would have to find enough deduplication to make up for the storage lost to VDO's metadata, so the minimum useful size of a VDO volume is in the 8-12GB range.
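
To put the sizes in this exchange side by side, here is a rough shell sketch. It assumes the `262144` field in the example table is the storage size in 4096-byte blocks, which matches the "1 GB physical space" in the example text; the variable names are hypothetical, and the arithmetic relies on a shell with 64-bit integers.

```shell
#!/bin/sh
# Sketch: compare the example table's physical size with the minimum
# that vdoformat reported. Needs 64-bit shell arithmetic.
min_bytes=5063921664      # "Minimum required size" from the vdoformat output
table_phys_blocks=262144  # assumed: storage size field of the example table
block_size=4096           # bytes per block

min_blocks=$(( min_bytes / block_size ))
min_mib=$(( min_bytes / 1024 / 1024 ))
table_mib=$(( table_phys_blocks * block_size / 1024 / 1024 ))

if [ "$table_phys_blocks" -lt "$min_blocks" ]; then
    verdict="too small"
else
    verdict="large enough"
fi

echo "vdoformat minimum: ${min_blocks} blocks (${min_mib} MiB)"
echo "example table:     ${table_phys_blocks} blocks (${table_mib} MiB), ${verdict}"
```

Under that assumption, the 1 GB example in the documentation describes a volume that the current vdoformat would refuse to create, which is the mismatch pointed out above.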

For the block map cache, decreasing the cache size may increase the frequency of metadata writes, which generally decreases the write throughput of the VDO device. So the tradeoff is between RAM and write speed.

Nothing about the generic structure of VDO would prevent us from producing a smaller VDO (and in fact we do for some testing purposes), but in a scenario where you can only expect to save a few gigabytes through deduplication, VDO is generally more expensive than it is worth.

If you still think this might be worth pursuing, let me know and we can try to work out a configuration which might suit your goals.

Matt Sakai




