On Thu, Aug 28, 2014 at 06:48:28PM -0400, Vasily Tarasov wrote:
> This is a second request for comments for dm-dedup.
>
> Updates compared to the first submission:
>
> - code is updated to kernel 3.16
> - construction parameters are now positional (as in other targets)
> - documentation is extended and brought to the same format as in other
>   targets
>
> Dm-dedup is a device-mapper deduplication target. Every write coming to the
> dm-dedup instance is deduplicated against previously written data. For
> datasets that contain many duplicates scattered across the disk (e.g.,
> collections of virtual machine disk images and backups), deduplication
> provides significant space savings.
>
> To quickly identify duplicates, dm-dedup maintains an index of hashes for
> all written blocks. A block is a user-configurable unit of deduplication
> with a recommended block size of 4KB. Dm-dedup's index, along with other
> deduplication metadata, resides on a separate block device, which we refer
> to as the metadata device. Although the metadata device can be any block
> device, e.g., an HDD or its own partition, for higher performance we
> recommend using an SSD to store metadata.
>
> Dm-dedup is designed to support pluggable metadata backends. A metadata
> backend is responsible for storing metadata: LBN-to-PBN and HASH-to-PBN
> mappings, allocation maps, and reference counters (LBN: Logical Block
> Number, PBN: Physical Block Number). Currently we have implemented the
> "cowbtree" and "inram" backends. The cowbtree backend uses the
> device-mapper persistent-data API to store metadata. The inram backend
> stores all metadata in RAM as a hash table.
>
> The detailed design is described here:
>
> http://www.fsl.cs.sunysb.edu/docs/ols-dmdedup/dmdedup-ols14.pdf
>
> Our preliminary experiments on real traces demonstrate that Dmdedup can
> even exceed the performance of a disk drive running ext4.
> The reasons are that (1) deduplication reduces I/O traffic to the data
> device, and (2) Dmdedup effectively sequentializes random writes to the
> data device.

Hi!

/me starts playing with the patches at:

git://git.fsl.cs.stonybrook.edu/linux-dmdedup.git#dm-dedup-devel

They seem to apply ok to 3.18-rc7, so I got to poke around long enough to
have questions/comments:

Is there a way for it to automatically garbage collect?  I started
rewriting the same block tons of times [1], but then the device filled up
and all the writes stopped.  If I sent the "garbage_collect" message every
15s it wouldn't wedge like that, but once it did hang, garbage collecting
didn't un-wedge the wac processes.

Loading with the cowbtree backend caused a crash in target_message (dm
core) with a RIP of zero when I tried to send the garbage_collect message.

It would be nice if one could send discards and (optionally) do checksum
verification on the read path.  I'll look into adding those once I get a
better grasp of what the code is doing.  Fortunately dm-dedup is short. :)

I suspect that the code in my_endio that uses bio_iovec to free the page
isn't going to work with the iterator rework.  When I tried bulk-writing
128M of zeroes to the device, it blew up while trying to free_pages a
nonexistent page.  Changing it to bio_for_each_segment_all() and freeing
bvec->bv_page gets us to free the correct page, at least, but the next IO
splats.

Thanks for clearing out some of the BUG*()s.

FYI, dm-dedup might be an easier way to do data block checksumming for
ext4, hence my interest.  I ran the ext4 metadata checksum test and it
managed to finish without any blowups, though xfstests was not so lucky.
Amusingly, the dedup ratio was ~53 when it finished.

--D

[1] wac.c: http://djwong.org/docs/wac.c
$ gcc -Wall -o wac wac.c
$ ./wac -l 65536 -n32 -m32 -y32 -z32 -f -r $DEDUPE_DEVICE

> Dmdedup is developed by a joint group of researchers from Stony Brook
> University, Harvey Mudd College, and EMC.
> See the documentation patch for more details.
>
> Vasily Tarasov (10):
>   dm-dedup: main data structures
>   dm-dedup: core deduplication logic
>   dm-dedup: hash computation
>   dm-dedup: implementation of the read-on-write procedure
>   dm-dedup: COW B-tree backend
>   dm-dedup: inram backend
>   dm-dedup: Makefile changes
>   dm-dedup: Kconfig changes
>   dm-dedup: status function
>   dm-dedup: documentation
>
>  Documentation/device-mapper/dedup.txt | 205 +++++++
>  drivers/md/Kconfig                    |   8 +
>  drivers/md/Makefile                   |   2 +
>  drivers/md/dm-dedup-backend.h         | 114 ++++
>  drivers/md/dm-dedup-cbt.c             | 755 ++++++++++++++++++++++++++
>  drivers/md/dm-dedup-cbt.h             |  44 ++
>  drivers/md/dm-dedup-hash.c            | 145 +++++
>  drivers/md/dm-dedup-hash.h            |  30 +
>  drivers/md/dm-dedup-kvstore.h         |  51 ++
>  drivers/md/dm-dedup-ram.c             | 580 ++++++++++++++++++++
>  drivers/md/dm-dedup-ram.h             |  43 ++
>  drivers/md/dm-dedup-rw.c              | 248 +++++++++
>  drivers/md/dm-dedup-rw.h              |  19 +
>  drivers/md/dm-dedup-target.c          | 946 +++++++++++++++++++++++++++++++++
>  drivers/md/dm-dedup-target.h          | 100 ++++
>  15 files changed, 3290 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/device-mapper/dedup.txt
>  create mode 100644 drivers/md/dm-dedup-backend.h
>  create mode 100644 drivers/md/dm-dedup-cbt.c
>  create mode 100644 drivers/md/dm-dedup-cbt.h
>  create mode 100644 drivers/md/dm-dedup-hash.c
>  create mode 100644 drivers/md/dm-dedup-hash.h
>  create mode 100644 drivers/md/dm-dedup-kvstore.h
>  create mode 100644 drivers/md/dm-dedup-ram.c
>  create mode 100644 drivers/md/dm-dedup-ram.h
>  create mode 100644 drivers/md/dm-dedup-rw.c
>  create mode 100644 drivers/md/dm-dedup-rw.h
>  create mode 100644 drivers/md/dm-dedup-target.c
>  create mode 100644 drivers/md/dm-dedup-target.h

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel