On 12/27/2009 06:59 PM, Pete Zaitcev wrote:
This patch adds a background process that periodically verifies the
integrity of all known objects. Objects that are found found faulty
are made invisible to applications. This way, only known good objects
remain visible and applications may implement integrity checking that
is run asynchronously with the self-checking. The intent is to let
the performance of checking to scale with the number of objects and
amount of data stored in the cell.
Signed-off-by: Pete Zaitcev<zaitcev@xxxxxxxxxx>
---
server/Makefile.am | 2
server/be-fs.c | 126 +++++++++++++++-
server/chunkd.h | 8 +
server/config.c | 28 +++
server/selfcheck.c | 208 ++++++++++++++++++++++++++
server/server.c | 2
test/.gitignore | 1
test/Makefile.am | 4
test/selfcheck-unit.c | 313 ++++++++++++++++++++++++++++++++++++++++
test/server-test.cfg | 2
test/test.h | 2
11 files changed, 693 insertions(+), 3 deletions(-)
OK. The objcache (patches 3 & 5) look good. I was thinking that the
"table of open files" was better suited for server/be-fs.c, thus keeping
the backend / frontend separation a bit more distinct. However, this is
a matter of taste, and I can also see advantages in how you implemented
it. So, no changes necessary, we'll how that works out.
The reason why I think about backend / frontend separation, when we only
have one storage backend, "the filesystem", is that I see a lot of merit
in a second type of backend: single-file (aka raw block device mode).
I don't want to stray too far from being able to implement a second type
of storage backend. On the flip side, of course, I don't want imaginary
future ideas to constraint us in the present.
On to this patch, self-checking. Regarding this patch's implementation,
the code appears to be correct. However, I wanted to discuss the design
a little bit.
I am definitely concerned that the current implementation does not
really scale at all with the datastore. With this change, the admin
would have to guess at the self-check period, with not much feedback
from the system about how long a scan takes, when a scan begins, or when
a scan ends. I also think a
sleep(n)
self_check()
algorithm seems less useful to the average admin than a slightly more
complex one that solves the problem defined as "guarantee an object is
checked at least every N days." Because, as the dataset grows beyond a
test database measures in megabytes, the dataset scan time will dwarf
the self-check sleep period. The behavior becomes one of constant
scanning, with a tiny period of rest in between.
It seems like either (a) tracking the total dataset size and total
dataset scan time, or (b) tracking and modifying per-object
last-self-check times, would be needed.
Also of relevance to admins is scan start time. This patch would have a
scan begin at essentially random times throughout the day or night, in
particular occurring during the most heavily-trafficked portion of the
day. An algorithm that occurs on hour N every day (or hour N, day M of
each week) is much more predictable, regardless of dataset size. That
would be pretty easy:
while (1)
calc time until next desired interval
(example: day 0, hour 5 (every sunday at 5am))
timer_init()
timer_add(scan_timer)
scan_timer()
if (scan thread exists)
log ("still scanning")
return
start scan thread
scan_thread()
dbscan()
end thread (do not loop)
Another option is to add an administrative command "START SCAN", and
permit an external utility, scheduled by cron or executed by "make
check", to be the entity that starts the scan thread in the background.
That would permit maximum flexibility for both the admin and our
testsuite.
Comments on all this? I will happily apply the objcache stuff once all
the self-checking details are settled.
Jeff
--
To unsubscribe from this list: send the line "unsubscribe hail-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html