Description of HFS+ compression

Hello, all. I've looked into how Mac OS X compresses files using "transparent compression".
Since I don't plan to use this data now, I thought it would be a good idea to document my
findings; perhaps someone will implement a reader for compressed files. Conceptually and in
implementation it is very similar to zisofs.
I assume the reader is familiar with TN1150:
http://developer.apple.com/legacy/mac/library/#technotes/tn/tn1150.html
Compression used: zlib
Block size: 64K
Bits missing from TN1150:
Attributes key (big-endian):
uint16_t   | unknown | always zero
uint32_t   | cnid    | file id of parent, most likely, not checked
uint32_t   | unknown | always zero
uint16_t   | namelen | length of name
uint16_t[] | name    | name in UTF-16BE
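
A minimal sketch of reading the key fields listed above, in Python. The function name parse_attr_key is hypothetical, and it assumes that buf starts at the first field of the table and that namelen counts UTF-16 code units rather than bytes; neither assumption has been checked against a real volume.

```python
import struct

KEY_FIXED = ">HIIH"  # unknown (u16), cnid (u32), unknown (u32), namelen (u16)

def parse_attr_key(buf):
    """Return (cnid, name) from an attributes key laid out as in the table."""
    _unk0, cnid, _unk1, namelen = struct.unpack_from(KEY_FIXED, buf, 0)
    start = struct.calcsize(KEY_FIXED)  # 12 bytes of fixed fields
    # Assumption: namelen is in UTF-16 code units, so the name takes 2 * namelen bytes.
    raw = buf[start:start + 2 * namelen]
    if len(raw) != 2 * namelen:
        raise ValueError("key is truncated")
    return cnid, raw.decode("utf-16-be")
```

For the compression attribute the decoded name would be expected to be com.apple.decmpfs.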

Attributes header (start of the value stored under the attributes key; big-endian):
uint8_t[3] | unknown | always zero
uint8_t    | type    | 0x10 = inline, the only type used for com.apple.decmpfs; the attribute data itself follows the header
uint32_t   | unknown | always zero
uint64_t   | size    | size of attribute
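
The attribute value can be unwrapped along these lines. This is only a sketch following the table above; parse_attr_value and ATTR_TYPE_INLINE are names made up for illustration, and the 64-bit width of the size field is taken from the table as written.

```python
import struct

ATTR_HEADER = ">3sBIQ"   # 3 zero bytes, type (u8), unknown (u32), size (u64)
ATTR_TYPE_INLINE = 0x10  # the only type seen for com.apple.decmpfs

def parse_attr_value(buf):
    """Return the inline attribute data that follows the attributes header."""
    _zeros, rtype, _unknown, size = struct.unpack_from(ATTR_HEADER, buf, 0)
    if rtype != ATTR_TYPE_INLINE:
        raise ValueError("not an inline attribute: type 0x%02x" % rtype)
    start = struct.calcsize(ATTR_HEADER)  # 16 bytes
    data = buf[start:start + size]
    if len(data) != size:
        raise ValueError("attribute data is truncated")
    return data
```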

Compressed attribute header (little-endian):
uint32_t   | magic             | "fpmc"
uint32_t   | unknown           | always 3
uint32_t   | uncompressed_size | uncompressed size if inline, 8 otherwise
uint32_t   | unknown           | always 0

If there is only one block and it's small enough, it is stored directly after the header.
Otherwise "# dummy\n" is stored instead, and the compressed data is stored in the resource fork of the file in question.
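
The inline case described above might be handled as follows. This is a sketch under the assumptions of this post only: read_decmpfs is a hypothetical name, the magic is compared as the raw on-disk bytes "fpmc", the field listed as "always 3" is not interpreted, and the inline payload is assumed to be a plain zlib stream.

```python
import struct
import zlib

DECMPFS_HEADER = "<4sIII"  # magic, unknown (3), uncompressed_size, unknown (0)
DUMMY = b"# dummy\n"       # placeholder stored when the data is in the resource fork

def read_decmpfs(attr):
    """Return the file contents if stored inline, or None if they are in the resource fork."""
    magic, _kind, usize, _zero = struct.unpack_from(DECMPFS_HEADER, attr, 0)
    if magic != b"fpmc":
        raise ValueError("bad magic: %r" % magic)
    payload = attr[struct.calcsize(DECMPFS_HEADER):]
    if payload == DUMMY:
        return None  # caller has to read the resource fork instead
    data = zlib.decompress(payload)
    if len(data) != usize:
        raise ValueError("uncompressed size mismatch")
    return data
```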

The headers Mac OS X uses to make the compressed data masquerade as some kind of resource:
Resource fork header (big-endian):
uint32_t  | header_size | always 0x100
uint32_t  | size        | total_compressed_size + seek_block_size + 4 + 0x100 
uint32_t  | size        | total_compressed_size + seek_block_size + 4
uint32_t  | unknown     | always 0x32
uint8_t[0xf0] | unknown | zero-filled
uint32_t  | size        | total_compressed_size + seek_block_size
It is followed by the seek block, which starts with (little-endian):
uint32_t  | nentries    | number of entries that follow
Each entry is (little-endian):
uint32_t  | compressed_offset | offset 0 corresponds to the nentries field
uint32_t  | compressed_size

The zlib-compressed blocks follow.
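
Putting the resource fork layout together, a reader could look roughly like this. It is a sketch, not a tested implementation: read_resource_fork is a hypothetical name, the size field right after the 0x100-byte header is assumed to be 4 bytes wide (which matches the "+ 4" in the size formulas), every block is assumed to be a zlib stream, and the trailer is ignored.

```python
import struct
import zlib

def read_resource_fork(fork):
    """Decompress all blocks listed in the seek block and return the file contents."""
    (header_size,) = struct.unpack_from(">I", fork, 0)  # always 0x100 per the table
    # Assumption: a 4-byte big-endian size field sits between the header and the seek block.
    base = header_size + 4  # position of nentries; compressed_offset is relative to it
    (nentries,) = struct.unpack_from("<I", fork, base)
    out = []
    for i in range(nentries):
        # Entries are (compressed_offset, compressed_size), 8 bytes each, little-endian.
        off, size = struct.unpack_from("<II", fork, base + 4 + 8 * i)
        out.append(zlib.decompress(fork[base + off:base + off + size]))
    return b"".join(out)
```

Since each block is independent, a real reader would not need to decompress everything: byte N of the file lives in entry N // 65536, given the 64K block size.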

The trailer is 50 bytes whose contents are always the same:
0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000010: 0000 0000 0000 0000 001c 0032 0000 636d  ...........2..cm
0000020: 7066 0000 000a 0001 ffff 0000 0000 0000  pf..............
0000030: 0000 
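
For anyone who wants to check or generate the trailer, here is the dump above transcribed into a Python constant. TRAILER and has_trailer are names made up for this sketch; whether a reader should insist on the trailer being present is an open question.

```python
# The 50 trailer bytes, copied from the hex dump above.
TRAILER = bytes.fromhex(
    "0000 0000 0000 0000 0000 0000 0000 0000"
    "0000 0000 0000 0000 001c 0032 0000 636d"
    "7066 0000 000a 0001 ffff 0000 0000 0000"
    "0000"
)

def has_trailer(fork):
    """Return True if the resource fork ends with the fixed trailer."""
    return fork.endswith(TRAILER)
```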


-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
