On Wed, Nov 26, 2014 at 01:35:34PM -0500, Vasily Tarasov wrote:
> Hi Mike, Vivek,
>
> Sounds good, thanks for looking into this!
>
> At this point we don't have a dedup_checker. Could you clarify a bit
> on the main use case for a checker? Sudden power loss or accidental
> corruption of metadata/data devices?

<shrug> No replies for a week, so I'll wade in. Keep in mind I'm an FS
developer, so don't take my replies as necessarily matching Mike or
Vivek's goals.

> In dm-dedup, metadata is stored using dm's persistent-data library
> (COW B-trees). Data blocks are written asynchronously with meta-data
> but allocated sequentially. So, theoretically, on a sudden power loss
> the state of a dm-dedup should remain consistent.

Theoretically, yes. :)

> But if somebody corrupts metadata/data devices manually the checker
> will help. Is it the main use case?

Or if the storage corrupts itself and you want/need to run a consistency
checker to scrape the broken crud off the disk so that you can recover
whatever's left. There are also cases such as recovering from accidental
reformats (if possible); patching things up after the kernel explodes
midway through some operation; fixing up the mess after your own software
bugs out; and recovering when the storage miswrites blocks to the wrong
place.

It would also be useful to verify that a block still matches its stored
hash; that for all LBN->PBN mappings there's also a hash->PBN mapping;
and (optionally) to garbage collect any hash->PBN mappings.

Theoretically you could also defrag the device. Maybe this can even be
done in a background kernel thread (ha!), since the metadata's already
sitting around in memory.

> We'll definitely take a look into the verifier's code for thin and
> cache targets and see how this applies to dm-dedup.

Looks promising so far, aside from the things I noted in yesterday's
email. Thanks for contributing all this work!
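
The three checks mentioned above (block still matches its stored hash,
every LBN->PBN mapping has a hash->PBN mapping, optional garbage
collection) can be sketched roughly as below. This is only an illustration
over a toy in-memory model: DedupState, check() and
gc_unreferenced_hashes() are hypothetical names, SHA-256 is an arbitrary
choice of hash, and nothing here reflects dm-dedup's actual on-disk format
or the persistent-data library. The garbage-collection rule (drop hash
entries whose PBN no LBN references) is likewise an assumption about what
"garbage collect" would mean in practice.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DedupState:
    """Toy stand-in for dedup metadata; hypothetical, not dm-dedup's layout."""
    blocks: dict = field(default_factory=dict)       # PBN -> block data (bytes)
    lbn_to_pbn: dict = field(default_factory=dict)   # LBN -> PBN
    hash_to_pbn: dict = field(default_factory=dict)  # digest (bytes) -> PBN


def block_hash(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for whatever hash the target uses.
    return hashlib.sha256(data).digest()


def check(state: DedupState) -> list:
    """Return a list of human-readable inconsistencies (empty if clean)."""
    errors = []

    # Check 1: each hash->PBN entry must match the data stored at that PBN.
    for digest, pbn in state.hash_to_pbn.items():
        data = state.blocks.get(pbn)
        if data is None:
            errors.append(f"hash entry points at missing PBN {pbn}")
        elif block_hash(data) != digest:
            errors.append(f"PBN {pbn} does not match its stored hash")

    # Check 2: each LBN->PBN mapping needs a hash->PBN mapping for that PBN.
    hashed_pbns = set(state.hash_to_pbn.values())
    for lbn, pbn in state.lbn_to_pbn.items():
        if pbn not in hashed_pbns:
            errors.append(f"LBN {lbn} maps to PBN {pbn}, which has no hash entry")

    return errors


def gc_unreferenced_hashes(state: DedupState) -> int:
    """Drop hash->PBN entries whose PBN no LBN references; return the count."""
    live_pbns = set(state.lbn_to_pbn.values())
    dead = [d for d, pbn in state.hash_to_pbn.items() if pbn not in live_pbns]
    for digest in dead:
        del state.hash_to_pbn[digest]
    return len(dead)
```

A real checker would walk the on-disk B-trees instead of Python dicts and
would have to decide what to repair versus merely report; the sketch only
shows the shape of the invariants.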

--D

>
> Thanks,
> Vasily
>
> On Wed, Nov 26, 2014 at 11:47 AM, Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> > On Wed, Nov 26 2014 at 11:36am -0500,
> > Erez Zadok <ezk@xxxxxxxxxxxxxxxxx> wrote:
> >
> >> Mike, Vivek,
> >>
> >> Thank you for the effort and especially for adding more man-power to
> >> this review. We know how busy you guys are so it’s understandable
> >> that things can take a while to get started. Either way, I’ve
> >> instructed my students to give this project the highest priority,
> >> especially once we receive comments from you.
> >
> > Great. So along those lines have you guys worked on userspace tools
> > that can verify/repair the ondisk metadata?
> >
> > That will be a prereq for upstream inclusion (at least for dm-dedup to
> > become anything but "experimental").
> >
> > dm-cache and dm-thin targets have these types of tools
> > (thin_{check,repair}, cache_{check,repair}, etc). Upstream repo is here
> > (misnamed, gets packaged into device-mapper-persistent-data rpm on
> > Fedora, RHEL, CentOS, etc):
> > https://github.com/jthornber/thin-provisioning-tools
> >
> > Mike
> >
>
> --
> dm-devel mailing list
> dm-devel@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/dm-devel