On Fri, Jul 30, 2010 at 05:35:04PM -0700, Sukadev Bhattiprolu wrote: > Oren Laadan [orenl@xxxxxxxxxxxxxxx] wrote: > > > > > > john stultz wrote: > >> On Fri, 2010-07-30 at 15:45 -0400, Oren Laadan wrote: > >>> Sukadev Bhattiprolu wrote: > >>>> Oren Laadan [orenl@xxxxxxxxxxxxxxx] wrote: > >>>> | | | > h->fl_type = lock->fl_type; > >>>> | > + h->fl_type_prev = lock->fl_type_prev; > >>>> | > h->fl_flags = lock->fl_flags; > >>>> | > + if (h->fl_type & F_INPROGRESS && | > > >>>> + (lock->fl_break_time > jiffies)) > >>>> | > + h->fl_rem_lease = (lock->fl_break_time - jiffies) / HZ; > >>>> | | Hmmm -- we have a ctx->ktime_begin marking the start of the > >>>> checkpoint. > >>>> | It is used for relative-time calculations, for example, the expiry of > >>>> | restart-blocks and timeouts. I suggest that we use it here too to be > >>>> | consistent. > >>>> > >>>> Well, I started off using ktime_begin but discussed this with John Stultz > >>>> (CC'd here) who pointed out that mixing different domains of time is likely > >>>> to cause errors. ktime is an absolute time and fl_break_time uses jiffies - > >>>> a relative time. > >>>> > >>>> I think use of ktime_begin for restart_blocks is fine (since they use > >>>> ktime_t) but using ktime_t for file leases and converting between jiffies > >>>> and nanoseconds could be a problem, unless we convert fl_break_time to > >>>> seconds. > >>>> > >>>> IOW, can we leave the above computation of ->fl_rem_lease for now ? > >>> The data on restart_blocks keep relative time - it's the the time > >>> to expiry relative to ktime_begin (which is absolute). > >>> > >>> ktime_begin merely gives a reference point in time against which > >>> all other time-related values should be saved. The advantage is > >>> that all time computation are relative to the same point in time > >>> at checkpoint/restart - the time when the ktime_begin is set. It's > >>> more consistent this way. > >> > >> First, forgive me for not being very aware of the checkpoint/restart > >> code. > > > > On the contrary, forgive me if I'm stating the obvious below ... > > > >> > >> So, ktime_begin is an absolute CLOCK_MONOTONIC time, relative to the > >> time the system booted (more or less). And it represents the checkpoint > >> time, correct? > > > > As a rule, all time measurements in the checkpoint image are > > saved as relative values, using the start-of-checkpoint as the > > reference point in time (*). > > > > So at checkpoint, every absolute time value should be converted > > to a value relative to the start-of-checkpoint. At restart, every > > relative time value from the image is converted back (if needed) > > to an absolute time value using a respective start-of-restart. > > One general observation, slightly off-topic. You mention > "start-of-restart" here and ... > > > > This makes sense for the most common case, where if a process > > had 5 more seconds to sleep at the time of checkpoint, we would > > like it to have 5 more seconds to sleep after it restarts. > > ... "after it restarts" here. These two can be quite different if, > as you mention below, the C/R is a lengthy operation. Worse! I think the big concern is not the duration of checkpoint but the amount of time userspace expects to store a checkpoint before restarting. It's an arbitrary amount of time. A user could restart 50 days later. Or 100. etc. Sure, it's _unlikely_, but we (checkpoint/restart implementers) shouldn't count on seeing only "reasonable" times between completion of checkpoint and initiation of restart. We need to be robust for all the times we see. I think that's why Oren made the choices he did. Making times relative to ktime_begin certainly helps keep the time values semi-sane for those crazy arbitrary cases in addition to being nice for the "reasonable" cases. Cheers, -Matt -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html