Re: BitRot notes

On 11/28/2014 08:30 AM, Venky Shankar wrote:
[snip]

1. Can the bitd be one per node like self-heal-daemon and other "global"
services? I worry about creating 2 * N processes for N bricks in a node.
Maybe we can consider having one thread per volume/brick etc. in a single
bitd process to make it perform better.

Absolutely.
There would be one bitrot daemon per node, per volume.


Do you foresee any problems in having one daemon per node for all volumes?



3. I think the algorithm for checksum computation can vary within the
volume. I see a reference to "Hashtype is persisted along side the checksum
and can be tuned per file type." Is this correct? If so:

a) How will the policy be exposed to the user?

The bitrot daemon would have a configuration file that can be managed
via the Gluster CLI. Tuning hash types could be based on file types or
file-name patterns (regexes) [which is a bit tricky, as bitrot works
on GFIDs rather than filenames, but this can be solved by a level of
indirection].
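As a sketch of that indirection, the policy lookup might look like the following. Everything here is hypothetical: the policy table, the GFID-to-path lookup, and all names are illustrative, not the actual config format.

```python
import re

# Hypothetical policy table: filename regex -> hash algorithm name.
HASH_POLICY = [
    (re.compile(r".*\.iso$"), "sha256"),  # large, rarely-modified images
    (re.compile(r".*\.log$"), "sha1"),    # churny files, cheaper hash
]
DEFAULT_HASH = "sha256"

# The level of indirection: bitd sees GFIDs, so a GFID -> path lookup
# (e.g. via the .glusterfs backend symlinks) maps back to a filename.
GFID_TO_PATH = {"11111111-2222-3333-4444-555555555555": "/vol/logs/app.log"}

def hash_type_for(gfid):
    """Resolve the GFID to a path, then match file-name patterns."""
    path = GFID_TO_PATH.get(gfid)
    if path is None:
        return DEFAULT_HASH
    for pattern, algo in HASH_POLICY:
        if pattern.match(path):
            return algo
    return DEFAULT_HASH
```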


b) It would be nice to have the algorithm for computing checksums be
pluggable. Are there any thoughts on pluggability?

Do you mean the default hash algorithm be configurable? If yes, then
that's planned.

Sounds good.



c) What are the steps involved in changing the hashtype/algorithm for a
file?

Policy changes for file {types, patterns} are lazy, i.e., they take
effect during the next recompute. For objects that are never modified
(after the initial checksum compute), scrubbing can recompute the
checksum using the new hash _after_ verifying the integrity of the
file with the old hash.
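The verify-then-rehash step can be sketched as follows (a minimal sketch, assuming the scrubber has the object data, the stored digest, and both algorithm names; the function name is made up):

```python
import hashlib

def scrub_and_rehash(data, stored_digest, old_algo, new_algo):
    """Verify integrity with the old hash; only then recompute with the new.

    If the old-hash check fails, the object has rotted and must NOT be
    re-stamped with a fresh checksum, or the corruption would be
    legitimized. Returns None on rot, else (new_algo, new_digest).
    """
    if hashlib.new(old_algo, data).hexdigest() != stored_digest:
        return None  # rot detected: surface for repair, do not rehash
    return new_algo, hashlib.new(new_algo, data).hexdigest()
```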



4. Is the fop on which change detection gets triggered configurable?

As of now, all data-modification fops trigger checksum calculation.


I wish I had been clearer about this in my original post. Is the fop on which checksum verification/bitrot detection happens configurable? The feature page mentions "open" as a trigger point for this. Users might want to trigger detection on a "read" operation rather than on open; it would be good to provide this flexibility.
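The flexibility being asked for amounts to a user-settable trigger set. A tiny sketch, assuming a hypothetical configuration knob (the fop names and the default of "open" mirror the feature page; nothing here is the actual implementation):

```python
def should_verify(fop, trigger_fops=frozenset({"open"})):
    """Return True if bitrot verification should run for this fop.

    trigger_fops is the hypothetical user-configurable set; the default
    mirrors the feature page's use of "open" as the trigger point.
    """
    return fop in trigger_fops
```

With the default, `should_verify("read")` is False; a user who wants detection on reads would configure `trigger_fops={"read"}`.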



6. Any thoughts on integrating the bitrot repair framework with self-heal?

There are some thoughts on integration with the self-heal daemon and EC.
I'm coming up with a doc that covers those [the reason for the delay in
replying to your questions ;)]. Expect the doc on gluster-devel@
soon.

Will look forward to this.



7. How does detection figure out that a lazy update is still pending and not
raise a false positive?

That's one of the things that Rachana and I discussed yesterday.
Should scrubbing *wait* while a checksum update is in progress, or
is it expected that scrubbing happens when there are no active I/O
operations on the volume (both of which imply that the bitrot daemon
needs to know when it's done its job)?

If scrubbing and checksum updating go in parallel, then there needs
to be a way to synchronize those operations. Maybe compute the
checksum with priority, based on a hint provided by the scrub process
(that leaves a little window for rot, though)?

Any thoughts?

Waiting for no active I/O in the volume might be a difficult condition to reach in some deployments.

Some form of waiting is necessary to prevent false positives. One possibility might be to mark an object as dirty until its checksum update is complete. Verification/scrub can then skip dirty objects.
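The dirty-marking idea above can be sketched like this (an assumed design, not an implemented one; the class and function names are made up, and sha256 is just a stand-in hash):

```python
import hashlib

class BitrotObject:
    """Toy object with a dirty flag guarding its stored checksum."""
    def __init__(self, data):
        self.data = data
        self.dirty = True       # checksum not yet (re)computed
        self.checksum = None

def lazy_update(obj):
    """Complete the lazy checksum update, then clear the dirty flag."""
    obj.checksum = hashlib.sha256(obj.data).hexdigest()
    obj.dirty = False           # only now is the object scrubbable

def scrub(obj):
    """Return True if rot is detected, False if clean,
    None if skipped because an update is still pending."""
    if obj.dirty:
        return None             # skip: update pending, no false positive
    return hashlib.sha256(obj.data).hexdigest() != obj.checksum
```

Because writers set the flag before touching data and clear it only after the recompute, the scrubber can never mistake an in-flight lazy update for rot; silent corruption (data changing with no dirty mark) is still caught.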

-Vijay

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
