Serge E. Hallyn wrote: > Quoting Oren Laadan (orenl@xxxxxxxxxxx): >> >> Matt Helsley wrote: >>> On Wed, Oct 21, 2009 at 07:51:57PM -0500, Serge E. Hallyn wrote: >>>> Quoting Oren Laadan (orenl@xxxxxxxxxxx): >>> <snip> >>> >>>>>> More practically, requiring userspace to pass over a flag >>>>>> consisting of CKPT_DBG_MEM|CKPT_DBG|FILE|CKPT_DBG|TASK, and >>>>>> handle corresponding usage flags, is not nice. >>>>> I agree with you on about this. Maybe we want a better >>>>> interface ? >>>>> >>>>> Which brings me to this random thought: maybe we want to >>>>> make the fourth argument of sys_{checkpoint,restart} a >>>>> structure, to make it easier to extend it in the future >>>>> without having to go throw a clone3-like hell... >>> Adding new kernel interfaces is supposed to be somewhat hellish. >>> >>>>> Specifically, this structure could now be: >>>>> >>>>> struct ckpt_args { >>>>> int version; >>>>> int logfd; >>>>> int logmask; >>>>> }; >>>>> >>>>> (or use union checkpoint {} and union restart {} to tell >>>>> between checkpoint- and restart-related args. >>>> Well I don't like passing structs to the kernel actually (and >>> Let's not do this. I agree that passing structs, when unnecessary, >>> is gross. Especially if it gets used to extend the arguments >>> passed via the syscall interface (new flag values I don't mind). >> Ok, we already allow future extension by being strict about >> which flags are taken or not. >> >> Then what do we do with logmask ? I prefer it to be a per-syscall >> value as opposed to a system-wise setting. > > Ok well let's be honest with ourselves - we'd like to give end-users > reasonable debug info to figure out common problems. But we will > always have cases where we have to hand-instrument the kernel for > our own advanced debugging. Cases where you must print out only a > specific info to catch a race condition probably will fit into > such a case. So I'd prefer to aim for giving the end-user just one > or a few options - nothing, errors only, or all debugging info that > we care to insert in everyday kernels (which hopefully isn't too > much, but let's a user figure out approximately where the problem > happened, i.e. recreating a socket, opening a file, etc). Otherwise I'm ok with this nothing-error-all scheme. Two questions are: What's "reasonable debug info" ? How does it differ than looking at the image (checkpoint/restart) at the specific file position and error message ? > too much debugging will (a) deter from the readability of the code > (b) bulk up the kernel and (c) make the api cumberson (too many > flags to checkpoint/restart userspace programs, and passing structs > to syscall). > > In fact I'm wondering whether for the shipped c/r kernel we want > to have only ckpt_error()s and make sure we record the location in > the restart file where we errored out. The ckpt_write_err() info > is already very helpful in tracking down checkpoint errors. Of course > this becomes a pain whenever we introduce a subte bug and have to > hand-instrument everything. And then I have to curse all the printks > I have to add. But it's not reasonabe to have every other line of > c/r code be a debug statement either. So you suggest ckpt_error() and ckpt_debug(). Fair enough. Yet I'm not willing to give up my existing (too noisy for the user) debug statements for a chance to curse at their removal. So I suggest also _ckpt_debug(), only with CONFIG_CHECKPOINT_DEBUG. They are there for developers, they can be used by a user who is having an issue and works with a developer, and they prevent Serge from cursing unnecessarily :p (or can change the naming: ckpt_error, ckpt_log, ckpt_debug, etc). Oren. _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/containers