Drives A and B have many overlapping files, but I want to find out which files are missing from each. What complicates this is that the directory structure differs between the two drives, and I'm fairly certain some of the file names differ as well. Therefore I need something hash based.

I started with this:

$ find /brickA -type f -exec md5sum "{}" + > brickA.txt
$ find /brickB -type f -exec md5sum "{}" + > brickB.txt

What I need next is to:

1. Make a copy of the files, brickAcopy.txt and brickBcopy.txt.
2. Loop: extract each md5sum in brickA.txt, grep for it in brickAcopy.txt and brickBcopy.txt, and if it's found in both, delete the line in both files.

What remains in each file are the paths to files that don't exist on the other drive.

This must be a solved problem, so I'm open to alternative approaches. Both drives use Btrfs, so I can create snapshots and perform a "dedup" operation on those snapshots directly. Ideally the dedup would delete the matching files in both snapshots (i.e. it would be considered data loss if it weren't for the snapshots), just to save time. But if necessary I'll do a one-way dedup, then repeat it in the other direction, and suffer the extra processing time.

Ideas?

--
Chris Murphy
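
One possible way to get the result of the loop described above without copying and editing the lists is to filter each list against the hashes found in the other. The following is only a sketch under some assumptions: the helper name only_in and the output names onlyA.txt and onlyB.txt are made up for illustration, and the hash is assumed to be the first whitespace-delimited field of each line, as in md5sum's default output.

```shell
# only_in LIST OTHER
# Print the lines of LIST whose hash (first field) never appears in OTHER.
# Hypothetical helper; LIST and OTHER are files in md5sum output format.
only_in() {
    awk '
        # First pass: remember every hash seen in OTHER.
        FILENAME == ARGV[1] { seen[$1] = 1; next }
        # Second pass: keep the lines of LIST whose hash was not seen.
        !($1 in seen)
    ' "$2" "$1"
}

# Possible usage with the lists from the post (output names are assumptions):
#   only_in brickA.txt brickB.txt > onlyA.txt   # files only on drive A
#   only_in brickB.txt brickA.txt > onlyB.txt   # files only on drive B
```

Each list is read once, so this avoids running grep once per hash. One caveat: GNU md5sum prefixes a line with a backslash when the file name contains a backslash or newline, so such files may need separate handling.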