Re: [Patch 4/4] chunkd: add self-checking

On Tue, 12 Jan 2010 09:21:24 -0500
Jeff Garzik <jeff@xxxxxxxxxx> wrote:

> 	sleep(n)
> 	self_check()
> 
> algorithm seems less useful to the average admin than a slightly more 
> complex one that solves the problem defined as "guarantee an object is 
> checked at least every N days."  Because, as the dataset grows beyond a 
> test database measured in megabytes, the dataset scan time will dwarf 
> the self-check sleep period.  The behavior becomes one of constant 
> scanning, with a tiny period of rest in between.

That's obvious. You also forgot to mention that your "fat" nodes
exacerbate the problem, and conversely, that splitting them up helps.
The single-threaded design is on purpose: it provides a crude
method of load control. But that aside, two things about your
scheme:
 - how do you select N? It is no different from n in its arbitrariness.
 - "as the data set grows beyond a test database", it is going to
   take more work to satisfy N.
What you are proposing is actually no different. It is not more
adaptive.
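
To put rough numbers on both schemes, here is a back-of-the-envelope
sketch. Nothing in it comes from the patch; the function names and
the units (seconds, bytes) are my own assumptions. The first function
gives the fraction of time a sleep(n)/self_check() loop spends
scanning, the second gives the sustained read rate needed to touch
every byte once per period N.

```c
#include <assert.h>

/*
 * Fraction of wall-clock time spent scanning in a loop of
 * "sleep(n); self_check()", where one full pass takes scan_secs.
 * Hypothetical helper, not part of chunkd.
 */
double scan_duty_cycle(double scan_secs, double sleep_secs)
{
	assert(scan_secs >= 0.0 && sleep_secs >= 0.0);
	assert(scan_secs + sleep_secs > 0.0);
	return scan_secs / (scan_secs + sleep_secs);
}

/*
 * Sustained read rate (bytes/sec) needed to check every object at
 * least once per period_secs ("N days" expressed in seconds).
 * Hypothetical helper, not part of chunkd.
 */
double required_scan_rate(double dataset_bytes, double period_secs)
{
	assert(dataset_bytes >= 0.0 && period_secs > 0.0);
	return dataset_bytes / period_secs;
}
```

With a fixed n the duty cycle tends to 1 as the scan time grows; with
a fixed N the required rate grows linearly with the dataset. Either
way the work is driven by the dataset size, not by the constant.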

> It seems like either (a) tracking the total dataset size and total 
> dataset scan time, or (b) tracking and modifying per-object 
> last-self-check times, would be needed.

Well, that would be nice.

Still, we are powerless to keep the scan time down when the dataset
grows. At best, we can sacrifice the mean detection time and constrain
the power that the scan consumes.
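
A minimal sketch of that trade-off, assuming the scan is throttled to
a maximum read rate. The helpers and their parameters are
hypothetical, not existing chunkd code. The first computes how long to
pause after checking one object; the second estimates the mean
detection delay as half a pass, assuming the scan runs continuously
and corruption appears at a uniformly random moment.

```c
#include <assert.h>

/*
 * How long to pause after checking one object so that the scan's
 * average read rate stays at or below max_rate (bytes/sec).
 * elapsed_secs is the time the check itself took.
 */
double throttle_pause_secs(double obj_bytes, double elapsed_secs,
			   double max_rate)
{
	double budget_secs;

	assert(obj_bytes >= 0.0 && elapsed_secs >= 0.0 && max_rate > 0.0);
	budget_secs = obj_bytes / max_rate;	/* time this object may cost */
	return budget_secs > elapsed_secs ? budget_secs - elapsed_secs : 0.0;
}

/*
 * Rough mean detection delay for a continuous scan capped at
 * max_rate: half of one full pass over the dataset.
 */
double mean_detect_secs(double dataset_bytes, double max_rate)
{
	assert(dataset_bytes >= 0.0 && max_rate > 0.0);
	return 0.5 * dataset_bytes / max_rate;
}
```

Lowering max_rate cuts the load the scan puts on the node and raises
the mean detection delay in the same proportion.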

I'm thinking about doing just that actually. But first I'm going to
implement a reporting scheme in tabled.

> Also of relevance to admins is scan start time.  This patch would have a 
> scan begin at essentially random times throughout the day or night, in 
> particular occurring during the most heavily-trafficked portion of the 
> day.  An algorithm that occurs on hour N every day (or hour N, day M of 
> each week) is much more predictable, regardless of dataset size.

So, you want the mean time to detect to be at least 12 hours.
I can do that if you insist.
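
For reference, the scheduling half of "hour N every day" is small.
The sketch below is only an illustration: the function name and the
seconds-since-local-midnight input are assumptions, not anything in
the patch. Averaged over a uniformly random moment of the day, the
wait it returns is about 12 hours, which is where the figure above
comes from (the time the scan then needs to reach the object comes on
top of that).

```c
#include <assert.h>

#define SECS_PER_HOUR	3600L
#define SECS_PER_DAY	(24L * SECS_PER_HOUR)

/*
 * Seconds from now until the next occurrence of start_hour:00.
 * now_secs is seconds since local midnight (0..86399).
 * Returns 0 if it is exactly start_hour:00 right now.
 */
long secs_until_scan(long now_secs, int start_hour)
{
	long target;

	assert(now_secs >= 0 && now_secs < SECS_PER_DAY);
	assert(start_hour >= 0 && start_hour < 24);
	target = start_hour * SECS_PER_HOUR;
	/* Adding a full day before the modulo keeps the result non-negative. */
	return (target - now_secs + SECS_PER_DAY) % SECS_PER_DAY;
}
```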

> Another option is to add an administrative command "START SCAN", and 
> permit an external utility, scheduled by cron or executed by "make 
> check", to be the entity that starts the scan thread in the background. 
>   That would permit maximum flexibility for both the admin and our 
> testsuite.

We can do that too, I guess. Just don't know when.

-- Pete