Re: Data deduplication for Linux : lessfs

Roy Sigurd Karlsbakk <roy@karlsbakk.net> · Wed, 24 Jun 2009 22:09:38 +0200

On 24 June 2009, at 22:04, Les Mikesell wrote:

Roy Sigurd Karlsbakk wrote:
On 24 June 2009, at 17:12, Mark Ruijter wrote:
For those who need OpenSource data deduplication today instead of
tomorrow one might take a look at lessfs.
http://www.lessfs.com
It's a good idea, but given the current traffic on the lessfs
mailing list, I'm not sure much work is being done. I have been a
member of that list since June 1 and haven't received more than one
message, which was the one I wrote myself.

I am thinking about starting to work on a data deduplicating
blockdevice, a kernel module called blockless.
If done smartly, this may be possible, but the problem is the
filesystem's metadata. Is this going to be dedup'ed? How much
will this take? A simple backup will update atime on all the files
backed up, and although atime isn't always wanted or needed, the
same problem occurs elsewhere.

Block level deduplication isn't going to know/care about the  
difference between file contents and metadata.  It is either stored  
in blocks that match other blocks or not and the difference should  
not be visible to the filesystem living on top of the block device.

My point exactly. If dedup were to be done at the block layer, you'd
need a flag to say "do not dedup this".
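
To make the discussion concrete, here is a minimal sketch of a
content-addressed block store with such a flag. It is a toy model in
Python, not how lessfs or the proposed blockless module works; the
names DedupBlockStore, BLOCK_SIZE and the dedup parameter are all
made up for illustration, and the 4 KiB block size is an assumption.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, not taken from this thread


class DedupBlockStore:
    """Toy content-addressed block store (hypothetical illustration)."""

    def __init__(self):
        self.blocks = {}   # storage key -> block contents
        self.mapping = {}  # logical block number -> storage key

    def write(self, lbn, data, dedup=True):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        if dedup:
            # Identical contents hash to the same key and share one copy,
            # whether they hold file data or filesystem metadata.
            key = hashlib.sha256(data).digest()
        else:
            # "Do not dedup this": a private key per logical block.
            key = ("private", lbn)
        self.blocks[key] = bytes(data)
        self.mapping[lbn] = key

    def read(self, lbn):
        return self.blocks[self.mapping[lbn]]

    def stored_blocks(self):
        # Count only copies that some logical block still references.
        return len(set(self.mapping.values()))


store = DedupBlockStore()
data = b"A" * BLOCK_SIZE
meta = bytes(BLOCK_SIZE)               # stand-in for a metadata block

store.write(0, data)
store.write(1, data)                   # same contents, shares block 0's copy
store.write(2, meta)
store.write(3, meta, dedup=False)      # same contents, kept separate anyway

print(store.stored_blocks())
```

Blocks 0 and 1 share one stored copy, while block 3 keeps its own
copy even though it matches block 2. The store itself cannot tell
which blocks are metadata, so in this sketch the caller has to pass
the flag down.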

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to
avoid excessive use of idioms of foreign origin. In most cases,
adequate and relevant synonyms exist in Norwegian.

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/