Re: Data deduplication for Linux : lessfs

Hi Roy,

It's a good idea, but given the current traffic on the lessfs mailing list, I'm not sure much work is being done. I have been a member of that list since June 1 and have received only one message, which was the one I wrote myself.

Almost all the traffic is on the forum - open discussion.
Only one person posted to the mailing list. ;-)

If done smartly, this may perhaps be possible, but the problem is the filesystem's metadata. Is this going to be dedup'ed? How much will this take? A simple backup will update atime on all the files backed up, and although atime isn't always wanted or needed, the problem occurs elsewhere.
Typically the metadata on production systems is approx. 10% to 20% of the deduplicated stored data.
On my systems, the stored data is 40x smaller than the data written to the filesystem.

For example, from a real-life backup server that makes dozens of backups each day:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/cciss/c0d0p3     9.7G  2.4G  6.9G  26% /
/dev/cciss/c0d0p1      99M   23M   72M  24% /boot
tmpfs                 7.9G     0  7.9G   0% /dev/shm
/dev/cciss/c0d0p4     246G  6.0G  241G   3% /meta
/dev/cciss/c0d1p1     274G   73G  202G  27% /blockdata
/dev/cciss/c1d0p1     4.1T  1.5T  2.7T  35% /data
lessfs                4.1T  1.5T  2.7T  35% /pooldata
[root@lessfssrv pooldata]# du . -s -h
31T     .
[root@lessfssrv pooldata]# ls -alh /data/current/
total 314G
drwxr-xr-x 2 root root   26 Jun  1 00:12 .
drwxr-xr-x 6 root root   59 Jun  1 00:12 ..
-rw-r--r-- 1 root root 314G Jun 22 14:26 blockdata.tch
[root@lessfssrv pooldata]# ls -alh /meta/current/
total 1.4G
drwxr-xr-x 2 root root   63 Jun  1 00:12 .
drwxr-xr-x 6 root root   59 Jun  1 00:12 ..
-rw-r--r-- 1 root root 1.3G Jun 22 14:52 blockusage.tch
-rw-r--r-- 1 root root  89M Jun 22 14:45 dirent.tcb
-rw-r--r-- 1 root root  89M Jun 22 14:52 metadata.tcb
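
For anyone who wants to redo the arithmetic on the listing above, here is a rough sketch. It assumes that the 31T reported by du is the logical data written to lessfs, that the 1.5T "Used" on /data is what is physically stored, and that the 1.4G total under /meta/current can be compared against the 314G blockdata.tch file. Those pairings are my reading of the output, not something lessfs reports itself, and the variable names are made up for illustration.

```python
# Rough arithmetic on the figures shown in the df/du/ls output above.
# Which numbers to pair up is an assumption, not a lessfs-reported value.

TIB = 1024 ** 4  # df/du -h print binary units (T = TiB, G = GiB)
GIB = 1024 ** 3

logical_bytes = 31 * TIB        # du -s -h on /pooldata
stored_bytes = 1.5 * TIB        # "Used" column for /data in df -h
metadata_bytes = 1.4 * GIB      # "total" of ls -alh /meta/current/
blockdata_bytes = 314 * GIB     # size of blockdata.tch


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, rejecting a zero denominator."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


# How many times smaller the stored data is than the data written.
dedup_ratio = ratio(logical_bytes, stored_bytes)

# Metadata size as a percentage of the block data file.
metadata_pct = 100 * ratio(metadata_bytes, blockdata_bytes)

print(f"dedup ratio: {dedup_ratio:.1f}x")
print(f"metadata vs blockdata.tch: {metadata_pct:.2f}%")
```

With these particular pairings the ratio comes out at roughly 20x for this one server. The inputs are rounded -h values, so treat the result as an estimate only.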


Mark.


roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum is presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.


_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
