Re: reiser4 (ccreg40): very slow mount, poor unlink performance, questions about compression modes

On 09/10/2014 11:39 PM, Ivan Shapovalov wrote:
On Wednesday 10 September 2014 at 22:17:15, Edward Shishkin wrote:	
On 09/10/2014 09:00 PM, Ivan Shapovalov wrote:
Hi!

The preamble: recently I had to force-change my configuration (the old laptop
was stolen). What I have now is a combination of a tiny 16 GiB SSD and a huge
1 TiB HDD.

...So I've placed my /home on HDD. Partition size is 800 GiB, formatting
options are "create=ccreg40,compress=gzip1,compressMode=latt" and I have a few
questions.

1. What is the recommended compression mode?
The default one (conv).
OK, thanks.

More specifically, what is the default "conv" mode? What is its purpose, why is
it the default?
In this mode, intelligent switches take place in two interfaces:
1) in the FILE interface (if the first 64K of the file is incompressible, then
      management is passed to the unix-file plugin forever);
2) in the COMPRESSION interface (the compression transform is turned on/off
      on a dynamic lattice).

In the other compression modes, switches take place only in the COMPRESSION
interface.


I'm asking, because I wasn't able to understand its purpose from code, and the
code itself looks hackish in some places (hardcoded fallback to extent-only
files,
Actually, this is the implementation of a compression mode, not a hardcoded
fallback.


   hardcoded policy, hardcoded fallback to "latt" in many cases, etc).
ditto
Yes, I understand that this is an implementation and it has no obligation
to be configurable in every aspect... but it still feels somewhat strange.
E.g., why is "extents only" formatting forced when a file is decided to be
incompressible?


The "extents only" formatting policy was set to make debugging easier
while the "conv" compression mode was being implemented.

When "conv" is set, the cryptcompress plugin "sends a signal" to the upper
dispatcher to perform a switch to the unix-file plugin, which, in turn,
performs switches in the ITEM interface if the "smart" formatting policy is
installed (this is the "classic" tail conversion: tails to extents if the
file size is >= 20K, and back).

Setting "extents only" or "tails only" disables these switches.
Why "extents only" instead of "tails only"? When "conv" makes a decision
about the switch, the file is 64K long, so extents are better than tails.

I think that now we can set "smart" instead of "extents only": those
switches won't step on each other.
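
A minimal sketch of the three formatting policies as described above, in
Python rather than kernel C. The function name item_kind and the constant
TAIL_THRESHOLD are illustrative, not reiser4 identifiers; the only facts
taken from the thread are the policy names and the 20K threshold.

```python
TAIL_THRESHOLD = 20 * 1024  # "file size >= 20K", as stated in the thread


def item_kind(policy: str, file_size: int) -> str:
    """Return the item type a unix-file body would use under a policy.

    Illustrative model only: "extents only" and "tails only" never switch,
    while "smart" converts tails to extents at the threshold (and back).
    """
    if policy == "extents only":
        return "extent"
    if policy == "tails only":
        return "tail"
    if policy == "smart":
        return "extent" if file_size >= TAIL_THRESHOLD else "tail"
    raise ValueError(f"unknown formatting policy: {policy!r}")


# At the moment "conv" decides to switch, the file is 64K long, so "smart"
# and "extents only" agree on that file:
size_at_switch = 64 * 1024
same_choice = item_kind("smart", size_at_switch) == item_kind("extents only", size_at_switch)
```

Under this model the two policies only differ for files that later shrink
below the threshold, which is where "smart" would convert back to tails.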


  Why is the heuristic in the FILE interface check (compressible only if the
size can be reduced by a factor of two) different from the one in the
COMPRESSION interface (compressible if the size can be reduced at all)?


I wanted to increase the proportion of unix-files on the partition. That
showed better performance than the heuristic that performs switches in the
COMPRESSION interface. I still don't have a satisfactory explanation for this.
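
To make the difference between the two checks concrete, here is a hedged
Python sketch. It uses zlib at level 1 as a stand-in for the "gzip1"
transform, reads "reduced twice" as "at least halved", and treats the exact
comparison operators as assumptions; none of the function names below exist
in reiser4.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Size of the data after zlib level 1 (stand-in for "gzip1")."""
    return len(zlib.compress(data, 1))


def file_interface_says_compressible(first_chunk: bytes) -> bool:
    # Stricter check: the first 64K must shrink to half its size or less.
    return compressed_size(first_chunk) * 2 <= len(first_chunk)


def compression_interface_says_compressible(chunk: bytes) -> bool:
    # Looser check: any reduction at all counts.
    return compressed_size(chunk) < len(chunk)


def choose_file_plugin(first_chunk: bytes) -> str:
    """In "conv" mode a file that fails the strict check goes to unix-file."""
    if file_interface_says_compressible(first_chunk):
        return "cryptcompress"
    return "unix-file"


zeros = bytes(64 * 1024)                            # compresses very well
noise = os.urandom(64 * 1024)                       # does not compress
mixed = os.urandom(32 * 1024) + bytes(32 * 1024)    # shrinks, but not by half
plugins = [choose_file_plugin(c) for c in (zeros, noise, mixed)]
```

The half-random chunk is the interesting case: it passes the looser
COMPRESSION-interface check but fails the stricter FILE-interface one, so
under this model it would end up as a unix-file.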


(I'm sorry for too many questions. I'm just curious.)

2. The mount time of an 800-GiB partition is >20 seconds. With
dont_load_bitmap it's around 1-2 seconds. Why so long?
By default, all bitmap blocks are loaded into memory at mount time.
Now calculate the number of bitmap blocks that have to be read from disk
for an 800-GiB partition.
25 MiB of bitmaps. 20 seconds still looks strange...
Are the blocks specially processed? I don't see anything.
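
The arithmetic behind the 25 MiB figure can be checked with a short Python
sketch. It assumes 4 KiB blocks and one bitmap bit per block, and it ignores
any per-block bitmap header; the function name bitmap_stats is made up for
this example.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
BLOCK_SIZE = 4096  # assumption: 4 KiB filesystem blocks


def bitmap_stats(partition_bytes: int, block_size: int = BLOCK_SIZE):
    """Return (data blocks, bitmap bytes, bitmap blocks) for a partition."""
    blocks = partition_bytes // block_size
    bitmap_bytes = (blocks + 7) // 8                       # one bit per block
    bitmap_blocks = (bitmap_bytes + block_size - 1) // block_size
    return blocks, bitmap_bytes, bitmap_blocks


blocks, bitmap_bytes, bitmap_blocks = bitmap_stats(800 * GIB)
print(bitmap_bytes // MIB)   # 25
print(bitmap_blocks)         # 6400

# Spreading the observed 20 s over the bitmap blocks gives the average
# time spent per bitmap block, in milliseconds.
ms_per_block = 20_000 / bitmap_blocks
print(ms_per_block)          # 3.125
```

Roughly 3 ms per bitmap block is far more than sequential reading of 25 MiB
would need. It would be consistent with each bitmap block costing a separate
disk seek, but the thread does not confirm that this is the cause.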

   Why do other filesystems
have drastically shorter mount times? If they have an equivalent of
dont_load_bitmap enabled by default, why don't we do the same?
For historical reasons. I recommended not using large partitions
for reiser4, so there wasn't any need for this option.
OK...

3. Given a directory tree with ~20k files of total size around 20 GiB,
its removal takes forever. From strace I see that a single unlink takes
~1 second. Again, why so much? Is it related to my choice of "latt" compression
mode over the default "conv"?
Yes, in particular.
"latt" means that all file bodies are represented by fragments in
formatted nodes.
So... are all cryptcompress files stored in formatted nodes, without
any equivalent of extents?


Yes, cryptcompress files are composed of items of only one type, so-called
"ctails" (they resemble tails, but have a 1-byte header, which contains the
size of the file's logical cluster). Unlike the unix-file plugin, the
cryptcompress plugin doesn't perform switches in the ITEM interface.
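
A rough Python sketch of such an item layout: one header byte followed by
the body. The thread only says that the header holds the logical cluster
size; storing it as a power-of-two shift (so that 64K fits in one byte) is
an assumption of this sketch, and pack_ctail / unpack_ctail are hypothetical
names.

```python
import struct


def pack_ctail(cluster_size: int, body: bytes) -> bytes:
    """Build a toy ctail: 1-byte header (log2 of cluster size) + body."""
    shift = cluster_size.bit_length() - 1
    if cluster_size <= 0 or (1 << shift) != cluster_size:
        raise ValueError("cluster size must be a power of two")
    return struct.pack("B", shift) + body


def unpack_ctail(item: bytes):
    """Split a toy ctail back into (cluster size, body)."""
    if not item:
        raise ValueError("empty item")
    (shift,) = struct.unpack_from("B", item, 0)
    return 1 << shift, item[1:]


item = pack_ctail(64 * 1024, b"compressed bytes")
cluster_size, body = unpack_ctail(item)
```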


3a. I can reproduce the "directory not empty" bug :) Interestingly, it is
always the same directory under the aforementioned huge hierarchy. (I've
done the unpack-remove cycle a few times.)
I've concluded that this is caused by the unexpected disappearance
of a record that represents a directory entry in the directory item
(currently directory items are managed by the cde ITEM plugin, aka "compound
directory entries"). In the error path (ENOENT) the size of the directory is
not decremented, which makes the directory undeletable. I still don't know
who kills the entries. Special debugging info is needed to find and fix it.
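
The failure mode described above can be modelled with a toy directory in
Python. This is not reiser4 code: ToyDir and its methods are invented for
the illustration, and "size" here is simply a count of entries.

```python
import errno


class ToyDir:
    """Toy directory: a set of entries plus a separately tracked size."""

    def __init__(self):
        self.entries = set()
        self.size = 0

    def add(self, name: str) -> None:
        self.entries.add(name)
        self.size += 1

    def unlink(self, name: str) -> int:
        if name not in self.entries:
            # Error path: the entry is gone, but size is left untouched.
            return -errno.ENOENT
        self.entries.remove(name)
        self.size -= 1
        return 0

    def rmdir(self) -> int:
        # Emptiness is judged by the tracked size, not by the entries.
        return -errno.ENOTEMPTY if self.size > 0 else 0


d = ToyDir()
d.add("a")
d.add("b")
d.entries.discard("b")       # the unknown culprit kills the record
results = (d.unlink("a"), d.unlink("b"), d.rmdir())
# results: success, then -ENOENT, then -ENOTEMPTY with no entries left
```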
What kind of information is needed?


We need to find all the places where the records are created / killed
and insert a hook that prints such events for the entry that
unexpectedly disappears. This will give us a chance to find the culprit.
I have to say: this is not much fun...
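
The idea of such a hook, sketched in Python rather than as a kernel patch.
Everything here is hypothetical: the watched name, the trace_entry helper
and the two call sites merely stand in for the real places where directory
records are created and killed.

```python
WATCHED = "lost-entry"   # hypothetical name of the entry that disappears
events = []              # collected (event, name, call site) records


def trace_entry(event: str, name: str, where: str) -> None:
    """Record create/kill events, but only for the watched entry."""
    if name == WATCHED:
        events.append((event, name, where))
        print(f"{where}: {event} {name!r}")


def create_entry(directory: set, name: str) -> None:
    directory.add(name)
    trace_entry("create", name, "create_entry")


def kill_entry(directory: set, name: str) -> None:
    directory.discard(name)
    trace_entry("kill", name, "kill_entry")


entries = set()
create_entry(entries, "other")
create_entry(entries, WATCHED)
kill_entry(entries, WATCHED)
```

With a hook at every create/kill site, an entry that vanishes without a
matching "kill" line would point at a code path that has not been
instrumented yet.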

Thanks,
Edward.