Re: Data deduplication in LVM?

Roy Sigurd Karlsbakk <roy@karlsbakk.net> · Wed, 10 Jun 2009 21:04:45 +0200

On 10 June 2009, at 20:41, Roy Sigurd Karlsbakk wrote:

Hi all

I've been reading up a little on data deduplication, and have
been searching for an OSS filesystem with dedup, without much luck.
While testing snapshots and so on in LVM, I started wondering whether
dedup would be better off in LVM than in the filesystem. Would it be
possible/efficient to add dedup to the LVM layer, or perhaps to a layer
above LVM? This could make dedup work for all or most
filesystems. Make a hash table of 4k (or whatever) blocks, make
virtual blocks that point to the physical blocks, and run a
remapping/deduping job at night. If a block is written to,
copy-on-write could be used to increase speed.
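
To make the idea above a bit more concrete, here is a rough sketch of it as a toy in-memory model in Python. This is not LVM or device-mapper code. All names (DedupVolume, dedup_pass, vmap and so on) are made up for the example, SHA-256 is just an assumed choice of hash, and a real implementation would have to keep the mapping table on disk. A write always goes to a fresh physical block, so a shared block is never modified in place, which is the copy-on-write part in spirit.

```python
import hashlib

BLOCK_SIZE = 4096  # 4k blocks, as suggested in the message above


class DedupVolume:
    """Toy model of a deduplicating block layer (hypothetical, in memory)."""

    def __init__(self):
        self.physical = {}  # physical block number -> block contents
        self.refcount = {}  # physical block number -> number of virtual refs
        self.vmap = {}      # virtual block number -> physical block number
        self.next_pbn = 0

    def _alloc(self, data):
        # Store data in a new physical block that nobody references yet.
        pbn = self.next_pbn
        self.next_pbn += 1
        self.physical[pbn] = data
        self.refcount[pbn] = 0
        return pbn

    def _unref(self, pbn):
        # Drop one reference and free the block when it becomes unused.
        self.refcount[pbn] -= 1
        if self.refcount[pbn] == 0:
            del self.physical[pbn]
            del self.refcount[pbn]

    def _map(self, vbn, pbn):
        # Point a virtual block at a physical block, releasing the old one.
        old = self.vmap.get(vbn)
        self.refcount[pbn] += 1
        self.vmap[vbn] = pbn
        if old is not None:
            self._unref(old)

    def write(self, vbn, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        # Never overwrite in place: other virtual blocks may share the
        # old physical block, so new data goes to a fresh block.
        self._map(vbn, self._alloc(data))

    def read(self, vbn):
        return self.physical[self.vmap[vbn]]

    def dedup_pass(self):
        """The 'nightly job': remap identical blocks to a single copy.

        Returns the number of physical blocks still in use afterwards.
        """
        seen = {}  # content digest -> physical block kept for that content
        for vbn, pbn in sorted(self.vmap.items()):
            digest = hashlib.sha256(self.physical[pbn]).digest()
            keep = seen.setdefault(digest, pbn)
            # Compare the bytes too, so a hash collision cannot merge
            # two different blocks.
            if keep != pbn and self.physical[keep] == self.physical[pbn]:
                self._map(vbn, keep)
        return len(self.physical)
```

With this model, writing the same 4k of data to virtual blocks 0 and 1 uses two physical blocks until dedup_pass() runs, after which both virtual blocks point to one physical block. A later write to block 1 gets its own physical block again and leaves block 0 untouched.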

Answering myself: it seems this would be a problem without a
rather large change in the APIs. If I understand it correctly,
deduplicating metadata may hurt write performance quite badly,
and from the block layer, how do you know what's
metadata and what's not?

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum is presented
intelligibly. It is an elementary imperative for all pedagogues to
avoid excessive use of idioms of foreign origin. In most cases,
adequate and relevant synonyms exist in Norwegian.

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/