Hi Matthew,

Thanks for your reply!

On Thu, Jan 04, 2024 at 09:07:07PM -0500, Matthew Sakai wrote:
>
>
> On 12/28/23 14:16, Matthias Kaehlcke wrote:
> > Hi,
> >
> > On Fri, Nov 17, 2023 at 03:59:18PM -0500, Matthew Sakai wrote:
> > > This adds the admin-guide documentation for dm-vdo.
> > >
> > > vdo.rst is the guide to using dm-vdo. vdo-design is an overview of the
> > > design of dm-vdo.
> > >
> > > Co-developed-by: J. corwin Coburn <corwin@xxxxxxxxxxxxxx>
> > > Signed-off-by: J. corwin Coburn <corwin@xxxxxxxxxxxxxx>
> > > Signed-off-by: Matthew Sakai <msakai@xxxxxxxxxx>
> > > ---
> > >  .../admin-guide/device-mapper/vdo-design.rst | 415 ++++++++++++++++++
> > >  .../admin-guide/device-mapper/vdo.rst | 388 ++++++++++++++++
> > >  2 files changed, 803 insertions(+)
> > >  create mode 100644 Documentation/admin-guide/device-mapper/vdo-design.rst
> > >  create mode 100644 Documentation/admin-guide/device-mapper/vdo.rst
> > >
> > > diff --git a/Documentation/admin-guide/device-mapper/vdo-design.rst b/Documentation/admin-guide/device-mapper/vdo-design.rst
> > > new file mode 100644
> > > index 000000000000..c82d51071c7d
> > > --- /dev/null
> > > +++ b/Documentation/admin-guide/device-mapper/vdo-design.rst
> > > @@ -0,0 +1,415 @@
> > > +.. SPDX-License-Identifier: GPL-2.0-only
> > > +
> > > +================
> > > +Design of dm-vdo
> > > +================
> > > +
> > > +The dm-vdo (virtual data optimizer) target provides inline deduplication,
> > > +compression, zero-block elimination, and thin provisioning. A dm-vdo target
> > > +can be backed by up to 256TB of storage, and can present a logical size of
> > > +up to 4PB.
> > [snip]
>
> > > + block map cache size:
> > > + The size of the block map cache, as a number of 4096-byte
> > > + blocks. The minimum and recommended value is 32768 blocks.
> > > + If the logical thread count is non-zero, the cache size
> > > + must be at least 4096 blocks per logical thread.
> >
> > If I understand correctly the minimum of 32768 blocks results in the 128 MB
> > metadata cache mentioned in 'Tuning', which allows to access up to 100 GB
> > of logical space.
> >
> > Is there a strict reason for this minimum? I'm evaluating to use vdo on
> > systems with a relatively small vdo volume (say 4GB) and 'only' 4-8 GB of
> > RAM. The 128 MB of metadata cache would be a sizeable chunk of that, which
> > could make the use of vdo infeasible.
>
> The short answer is that VDO can often use a smaller cache than the default,
> but it likely won't help in the way you want it to.
>
> > > +Examples:
> > > +
> > > +Start a previously-formatted vdo volume with 1 GB logical space and 1 GB
> > > +physical space, storing to /dev/dm-1 which has more than 1 GB of space.
> > > +
> > > +::
> > > +
> > > + dmsetup create vdo0 --table \
> > > + "0 2097152 vdo V4 /dev/dm-1 262144 4096 32768 16380"
> >
> > IIUC the backing device needs to be previously formatted. The formatting
> > fails when the size of the backing device is < 5GB:
> >
> > vdoformat /dev/loop8
> > Minimum required size for VDO volume: 5063921664 bytes
> > vdoformat: formatVDO failed on '/dev/loop8': VDO Status: Out of space
> >
> > That was with 'vdoformat' from https://github.com/dm-vdo/vdo/
> >
> > It would be great if somewhat smaller devices could be supported.
>
> VDO was designed to handle the challenge of data deduplication in very large
> storage pools. It generally is not very useful for very small pools. The
> first question to ask is whether VDO can actually provide any value in the
> sort of environment you're using. VDO generally takes the strategy of saving
> storage space by using extra RAM and CPU cycles. In addition, VDO needs to
> track a certain amount of metadata, which reduces the amount storage
> available for actual user data.
>
> For vdoformat, the biggest consideration is the deduplication index and
> other metadata, which are basically a fixed cost of about 3.5GB.
> In order for VDO to be useful, VDO would have to find enough deduplication
> to make up for the storage lost to VDO's metadata, so the minimum useful
> size of a VDO volume is in the 8-12GB range.
>
> For the block map cache, decreasing the cache size may increase the
> frequency of metadata writes, which generally decreases the write throughput
> of the VDO device. So the tradeoff is between RAM and write speed.
>
> Nothing about the generic structure of VDO would prevent us from producing a
> smaller VDO (and in fact we do for some testing purposes), but in a scenario
> where you can only expect to save a few gigabytes through deduplication, VDO
> is generally more expensive than it is worth.
>
> If you still think this might be worth pursuing, let me know and we can try
> to work out a configuration which might suit your goals.

Some more context about my use case: I'm evaluating the use of VDO for
storing a hibernate image. The goal is to reduce hibernate resume time by
loading less data from potentially slow storage, which is why the volume is
relatively small.

The image is only written once per hibernate cycle, and generally after the
system has been idle for a while, so the lower write throughput that comes
with a smaller cache probably wouldn't be a major concern.

The systems might not have much free disk space, so an overhead of ~3.5GB
for the deduplication index would probably rule out the use of VDO.

For this use case the compression part of VDO seems more interesting than
the deduplication. In the VDO documentation I noticed a parameter to disable
deduplication. Given that, I wonder whether it would be feasible/reasonable
to add an option to vdoformat to omit the deduplication index.

Do you think VDO might be (made) suitable for this scenario, or is it just
not the right tool?

Thanks

Matthias
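
P.S. For anyone following the numbers in this thread, below is a rough
sketch of the sizing arithmetic (Python, purely illustrative). It assumes
512-byte sectors for the dm table and 4096-byte blocks as in the quoted
documentation. The function names are made up for this mail and are not
part of dm-vdo or vdoformat.

```python
# Sizing arithmetic for the figures quoted in this thread.
# Hypothetical helper names; nothing here is a VDO or vdoformat API.

SECTOR_BYTES = 512   # device-mapper table lengths are in 512-byte sectors
BLOCK_BYTES = 4096   # dm-vdo block size, per the quoted documentation
MIB = 1024 ** 2
GIB = 1024 ** 3


def block_map_cache_bytes(cache_blocks: int) -> int:
    """RAM used by a block map cache of the given number of blocks."""
    return cache_blocks * BLOCK_BYTES


def logical_bytes(table_sectors: int) -> int:
    """Logical size implied by the sector count in the dm table."""
    return table_sectors * SECTOR_BYTES


def physical_bytes(physical_blocks: int) -> int:
    """Physical size implied by the block count in the dm table."""
    return physical_blocks * BLOCK_BYTES


def cache_share_of_ram(cache_blocks: int, ram_gib: int) -> float:
    """Fraction of system RAM taken by the block map cache."""
    return block_map_cache_bytes(cache_blocks) / (ram_gib * GIB)


if __name__ == "__main__":
    # Minimum cache from the documentation: 32768 blocks.
    print(block_map_cache_bytes(32768) // MIB, "MiB of cache")
    # Example table: "0 2097152 vdo V4 /dev/dm-1 262144 4096 32768 16380"
    print(logical_bytes(2097152) // GIB, "GiB logical")
    print(physical_bytes(262144) // GIB, "GiB physical")
    # Share of RAM on the 4 GiB and 8 GiB systems mentioned above.
    for ram in (4, 8):
        print(ram, "GiB RAM:", f"{cache_share_of_ram(32768, ram):.1%}")
```

So the 128 MiB minimum cache comes to roughly 3% of RAM on a 4 GiB system
and about 1.6% on an 8 GiB one.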