Deduplication more or less handles sparse files, since all blocks containing only zeros get mapped to the same storage.
As soon as one of the blocks sharing storage is written to, the new data goes to a new block and the usage counter of the shared block is reduced by one. Once the usage count reaches zero, the block is flagged for reuse. At least that is how it seems to work in the NetApp WAFL file system. WAFL never rewrites a block in place; it always writes to a new location. I don't know about OneFS.
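To make the reference counting concrete, here is a toy sketch in Python. It is not WAFL or OneFS code; the class names, the 4K block size and the free-list handling are all assumptions made up for illustration. It only models the behaviour described above: identical blocks share one slot, a write always lands in a different slot, and a slot whose count drops to zero is flagged for reuse.

```python
from collections import Counter

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class DedupStore:
    """Toy block store in which identical blocks share one physical slot."""

    def __init__(self):
        self.physical = {}          # slot id -> block contents
        self.refcount = Counter()   # slot id -> number of logical blocks using it
        self.by_content = {}        # block contents -> slot id
        self.free = []              # slots flagged for reuse
        self.next_slot = 0

    def acquire(self, data):
        """Return the slot holding `data`, allocating one if none exists yet."""
        slot = self.by_content.get(data)
        if slot is None:
            if self.free:
                slot = self.free.pop()
            else:
                slot = self.next_slot
                self.next_slot += 1
            self.physical[slot] = data
            self.by_content[data] = slot
        self.refcount[slot] += 1
        return slot

    def release(self, slot):
        """Drop one reference; flag the slot for reuse when nobody uses it."""
        self.refcount[slot] -= 1
        if self.refcount[slot] == 0:
            data = self.physical.pop(slot)
            del self.by_content[data]
            del self.refcount[slot]
            self.free.append(slot)


class File:
    """A file that starts out as `nblocks` all-zero blocks (i.e. fully sparse)."""

    def __init__(self, store, nblocks):
        self.store = store
        zero = bytes(BLOCK_SIZE)
        self.blocks = [store.acquire(zero) for _ in range(nblocks)]

    def write_block(self, index, data):
        old = self.blocks[index]
        # The new data is placed first, so the old slot is never overwritten.
        self.blocks[index] = self.store.acquire(data)
        self.store.release(old)
```

With this model a 1000-block file of zeros occupies a single physical slot with a usage count of 1000, and writing real data to one block adds a second slot and drops the shared count to 999.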
Sent from my Sony Xperia™ smartphone
---- Rick Stevens wrote ----
On 04/01/2015 10:57 AM, Ranjan Maitra wrote:
> Thanks!
>
>
>> That's EMC's "OneFS" filesystem (EMC bought out Isilon).
>>
>>> On Wed, 1 Apr 2015 08:07:34 -0500 Ranjan Maitra <maitra.mbox.ignored@xxxxxxxxx> wrote:
>>>
>>>> Thanks to both Cameron and you, Bob!
>>>>
>>>> After the transfer, here is what we have, on that filesystem:
>>>>
>>>> $ du -sh kmeans --apparent-size
>>>> 154G kmeans
>>>>
>>>> $ du -sh kmeans
>>>> 628G kmeans
>>>>
>>>> So, I guess that leaves me (and others) stuck.
>>
>> Is "kmeans" on the target or the source filesystem?
>
> Sorry, this is on the target (Isilon FS). Locally (on an F21 workstation with an ext4 FS) it clocks in at 154G and 159G respectively.
>
>> If it's the source,
>> keep in mind that OneFS can do data dedupes (assuming it's enabled),
>> but it is a NAS device (NFS and/or SMB). I don't believe it's capable
>> of sparse files (few NAS are). The data dedupe would reduce the actual
>> storage on disk on the EMC device, but not report it as a sparse
>> filesystem.
>
>
> Yes, I have been given this explanation, as well as that the block size is turned up on the Isilon. This means that the minimum size of a single file on disk is probably 16K, rather than the typical 4K on a desktop. However, I do not have files that are small enough for that to make a difference. So, I don't know.
>
> I see: the dedupe is supposed to run over weekends but I am not sure what it does.
Deduping is a process by which redundant data on a storage device is
removed. You can loosely think of it as "gzip" at the block level on
the storage device itself (although gzip is _compression_, not
deduping). Everything on the device will _appear_ normal, but the
redundancies will have been removed and less physical space used.
Here's a good explanation:
http://www.webopedia.com/TERM/D/data_deduplication.html
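
For a rough feel of what block-level deduping can save, the following Python sketch counts how many fixed-size blocks in a chunk of data are actually distinct. This is only an illustration, not how OneFS does it: the function name `dedupe_stats`, the 4K default block size and the choice of SHA-256 as the fingerprint are all assumptions.

```python
import hashlib


def dedupe_stats(data: bytes, block_size: int = 4096):
    """Return (total_blocks, unique_blocks) for fixed-size blocks of `data`."""
    seen = set()
    total = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Identical blocks produce identical digests, so they count only once.
        seen.add(hashlib.sha256(block).digest())
        total += 1
    return total, len(seen)


if __name__ == "__main__":
    # Ten all-zero blocks followed by two different blocks.
    sample = bytes(4096 * 10) + b"a" * 4096 + b"b" * 4096
    total, unique = dedupe_stats(sample)
    print(f"{total} logical blocks, {unique} stored after deduping")
```

The logical size (what the files appear to contain) stays the same; only the number of blocks that have to be physically stored goes down.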
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital ricks@xxxxxxxxxxxxxx -
- AIM/Skype: therps2 ICQ: 22643734 Yahoo: origrps2 -
- -
- A squeegee, by any other name, wouldn't sound as funny. -
----------------------------------------------------------------------
--
users mailing list
users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
Have a question? Ask away: http://ask.fedoraproject.org